Hi All,
I am trying to create a new combination of variables, but I don't seem to get the right code in the community form. I would really appreciate it if anyone could help me with it.
I have the following data:
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
run;
I want to create a new variable that is a combination of yes (1) categories for each ID as in the following table:
Id |
var1 |
Cat1 |
Var2 |
Cat2 |
Category1 |
Category2 |
variable1 |
New Variable |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
Var1 |
2 |
1 |
0 |
0 |
0 |
0 |
0 |
1 |
Var1+Variable1 |
3 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
Var2+Cat2 |
4 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
None |
5 |
1 |
0 |
0 |
0 |
1 |
1 |
1 |
>3 variables |
6 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
>3 variables |
7 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
>3 variables |
8 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
Cat1 |
9 |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
Var2 |
10 |
0 |
0 |
0 |
1 |
0 |
0 |
0 |
Cat2 |
11 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
Category1 |
12 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
Category2 |
13 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
variable1 |
When there are more than 3 "Yes (1)" in a category then we assign them as >3 variables. I really appreciate any help.
Thanks,
Sandyzman1
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
@sandyzman1 Just re-read your question.
You need to change the following condition
from:
to:
and the length of new_variable must become $98 to avoid string truncation - 3*32 characters (max variable name length) and two plus signs.
I've already fixed the code in my first answer accordingly.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.