Hi All,
I am trying to create a new combination of variables, but I don't seem to get the right code in the community form. I would really appreciate it if anyone could help me with it.
I have the following data:
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
run;
I want to create a new variable that is a combination of yes (1) categories for each ID as in the following table:
Id |
var1 |
Cat1 |
Var2 |
Cat2 |
Category1 |
Category2 |
variable1 |
New Variable |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
Var1 |
2 |
1 |
0 |
0 |
0 |
0 |
0 |
1 |
Var1+Variable1 |
3 |
0 |
0 |
1 |
1 |
0 |
0 |
0 |
Var2+Cat2 |
4 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
None |
5 |
1 |
0 |
0 |
0 |
1 |
1 |
1 |
>3 variables |
6 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
>3 variables |
7 |
1 |
1 |
1 |
0 |
1 |
1 |
0 |
>3 variables |
8 |
0 |
1 |
0 |
0 |
0 |
0 |
0 |
Cat1 |
9 |
0 |
0 |
1 |
0 |
0 |
0 |
0 |
Var2 |
10 |
0 |
0 |
0 |
1 |
0 |
0 |
0 |
Cat2 |
11 |
0 |
0 |
0 |
0 |
1 |
0 |
0 |
Category1 |
12 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
Category2 |
13 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
variable1 |
When there are more than 3 "Yes (1)" in a category then we assign them as >3 variables. I really appreciate any help.
Thanks,
Sandyzman1
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
Below should return the desired result.
data have;
input Id $ var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
datalines;
1 1 0 0 0 0 0 0
2 1 0 0 0 0 0 1
3 0 0 1 1 0 0 0
4 0 0 0 0 0 0 0
5 1 0 0 0 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 0 1 1 0
8 0 1 0 0 0 0 0
9 0 0 1 0 0 0 0
10 0 0 0 1 0 0 0
11 0 0 0 0 1 0 0
12 0 0 0 0 0 1 0
13 0 0 0 0 0 0 1
;
data want(drop=_:);
set have;
array vars {*} var1 Cat1 Var2 Cat2 Category1 Category2 variable1;
length new_variable $98;
do _i=1 to dim(vars);
if vars[_i]=1 then
do;
_n=sum(_n,1);
if _n>3 then
do;
new_variable='>3 variables';
leave;
end;
else
do;
new_variable=catx('+',new_variable,vname(vars[_i]));
end;
end;
end;
if missing(new_variable) then new_variable='None';
run;
proc print data=want;
run;
@sandyzman1 Just re-read your question.
You need to change the following condition
from:
to:
and the length of new_variable must become $98 to avoid string truncation - 3*32 characters (max variable name length) and two plus signs.
I've already fixed the code in my first answer accordingly.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.