Hi All,
I am trying to create a new combination of variables, but I don't seem to get the right code in the community form. I would really appreciate it if anyone could help me with it.
I have the following data:
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
run;
I want to create a new variable that is a combination of yes (1) categories for each ID as in the following table:
Id |
var1 |
Cat1 |
Var2 |
Cat2 |
Category1 |
Category2 |
variable1 |
New Variable |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
Var1 |
2 |
1 |
0 |
0 |
0 |
0 |
0 |
1 |
Var1+Variable1 |
3 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
Var2+Cat2 |
4 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
None |
5 |
1 |
0 |
0 |
0 |
1 |
1 |
1 |
>3 variables |
6 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
>3 variables |
7 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
>3 variables |
8 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
Cat1 |
9 |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
Var2 |
10 |
0 |
0 |
0 |
1 |
0 |
0 |
0 |
Cat2 |
11 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
Category1 |
12 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
Category2 |
13 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
variable1 |
When there are more than 3 "Yes (1)" in a category then we assign them as >3 variables. I really appreciate any help.
Thanks,
Sandyzman1
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
@sandyzman1 Just re-read your question.
You need to change the following condition
from:
to:
and the length of new_variable must become $98 to avoid string truncation - 3*32 characters (max variable name length) and two plus signs.
I've already fixed the code in my first answer accordingly.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.