Trying to determine if there is a more efficient way to accomplish turning a categorical variable with 3 values into 3 binary variables:
if na_resp in (0,1,2) then
do;
na_resp_0=(na_resp=0);
na_resp_1=(na_resp=1);
na_resp_2=(na_resp=2);
end;
I have several variables with the same structure and would like to use an array to transform them all, but I can't figure out if it is possible to create the 3 new variables I need automatically from the variables in the array, i.e. I would need to create na_resp_0, na_resp_1, na_resp_2 for the first variable. Is it possible to affix a suffix to an array value? Something like:
array nurse (*) na_resp na_retract na_cyan na_airent na_grunt;
do i=1 to dim(nurse);
if nurse(i) in (0,1,2) then
do j=0 to 2;
nurse(i)_j=(nurse(i)=j);
end;
end;
Make a list of variables, and use a macro to create the code:
data have;
input id $ na_resp na_retract;
datalines;
A 0 2
B 1 1
C 2 0
;
%let varlist=na_resp na_retract;
%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
%let var = %scan(&vars,&i);
if &var in (0,1,2)
then do;
&var._0=(&var.=0);
&var._1=(&var.=1);
&var._2=(&var.=2);
end;
%end;
%mend;
data want;
set have;
%make_binary(&varlist);
run;
Make a list of variables, and use a macro to create the code:
data have;
input id $ na_resp na_retract;
datalines;
A 0 2
B 1 1
C 2 0
;
%let varlist=na_resp na_retract;
%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
%let var = %scan(&vars,&i);
if &var in (0,1,2)
then do;
&var._0=(&var.=0);
&var._1=(&var.=1);
&var._2=(&var.=2);
end;
%end;
%mend;
data want;
set have;
%make_binary(&varlist);
run;
Thanks, I was trying to mix macros and arrays and having no luck.This should do the trick.
You can make the code more efficient by replacing the many implicit IFs with a single SELECT statement:
%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
%let var = %scan(&vars,&i);
select (&var);
%do j = 0 %to 2;
when (&j.) &var._&j. = 1;
%end;
otherwise;
end;
%end;
%mend;
If you do not like the missing values for the "unset" variables, you can set them to 0 in a separate %DO loop. Unconditional assignments eat considerably less CPU than conditions.
It is generating a design matrix. @Rick_SAS wrote a blog about it a couple of years ago;
data have; input id $ na_resp na_retract; datalines; A 0 2 B 1 1 C 2 0 ; data temp; set have; dummy=1; run; proc logistic data=temp outdesign=x(drop=dummy) outdesignonly noprint; class na_resp na_retract /param=glm; model dummy=na_resp na_retract/nofit noint; run; data want; merge have x; run;
There is no need to write a program or use a macro. As KSharp says, this is called creating a design matrix, and SAS provides many ways to generate a design matrix.
If you want the names of the dummy variables to reflect the name and values of the original categori..., I suggest using PROC GLMSELECT, as follows:
/* add a fake response variable */
data AddY / view=AddY;
set have;
_Y = 1;
run;
proc glmselect data=AddY NOPRINT
outdesign(addinputvars)=Want2(drop=_Y);
class na_resp na_retract; /* list the categorical variables here */
model _Y = na_resp na_retract / /* and here */
noint selection=none;
run;
proc print data=Want2; run;
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.