BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kmorrowvt
Obsidian | Level 7

Trying to determine if there is a more efficient way to accomplish turning a categorical variable with 3 values into 3 binary variables:

if na_resp in (0,1,2) then
  do;
    na_resp_0=(na_resp=0);
    na_resp_1=(na_resp=1);  
    na_resp_2=(na_resp=2);
  end;

 

I have several variables with the same structure and would like to use an array to transform them all, but I can't figure out if it is possible to create the 3 new variables I need automatically from the variables in the array, i.e. I would need to create na_resp_0, na_resp_1, na_resp_2 for the first variable. Is it possible to affix a suffix to an array value? Something like:
array nurse (*) na_resp na_retract na_cyan na_airent na_grunt;
  do i=1 to dim(nurse);
    if nurse(i) in (0,1,2) then
     do j=0 to 2;
      nurse(i)_j=(nurse(i)=j);
    end;
  end;

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Make a list of variables, and use a macro to create the code:

data have;
input id $ na_resp na_retract;
datalines;
A 0 2
B 1 1
C 2 0
;

%let varlist=na_resp na_retract;

%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
  %let var = %scan(&vars,&i);
  if &var in (0,1,2)
  then do;
    &var._0=(&var.=0);
    &var._1=(&var.=1);  
    &var._2=(&var.=2);
  end;
%end;
%mend;

data want;
set have;
%make_binary(&varlist);
run;

 

View solution in original post

5 REPLIES 5
Kurt_Bremser
Super User

Make a list of variables, and use a macro to create the code:

data have;
input id $ na_resp na_retract;
datalines;
A 0 2
B 1 1
C 2 0
;

%let varlist=na_resp na_retract;

%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
  %let var = %scan(&vars,&i);
  if &var in (0,1,2)
  then do;
    &var._0=(&var.=0);
    &var._1=(&var.=1);  
    &var._2=(&var.=2);
  end;
%end;
%mend;

data want;
set have;
%make_binary(&varlist);
run;

 

kmorrowvt
Obsidian | Level 7

Thanks, I was trying to mix macros and arrays and having no luck.This should do the trick.

Kurt_Bremser
Super User

You can make the code more efficient by replacing the many implicit IFs with a single SELECT statement:

%macro make_binary(vars);
%do i = 1 %to %sysfunc(countw(&vars));
  %let var = %scan(&vars,&i);
  select (&var);
  %do j = 0 %to 2;
    when (&j.) &var._&j. = 1;
  %end;
    otherwise;
  end;
%end;
%mend;

If you do not like the missing values for the "unset" variables, you can set them to 0 in a separate %DO loop. Unconditional assignments eat considerably less CPU than conditions.

Ksharp
Super User

It is generating a design matrix. @Rick_SAS  wrote a blog about it a couple of years ago;

data have;
input id $ na_resp na_retract;
datalines;
A 0 2
B 1 1
C 2 0
;
data temp;
 set have;
 dummy=1;
run;
proc logistic data=temp outdesign=x(drop=dummy) outdesignonly noprint;
class  na_resp na_retract /param=glm;
model dummy=na_resp na_retract/nofit noint;
run;
data want;
 merge have x;
run;
Rick_SAS
SAS Super FREQ

There is no need to write a program or use a macro. As KSharp says, this is called creating a design matrix, and SAS provides many ways to generate a design matrix.

 

If you want the names of the dummy variables to reflect the name and values of the original categori..., I suggest using PROC GLMSELECT, as follows:

 

/* add a fake response variable */
data AddY / view=AddY;
set have;
_Y = 1;
run;

proc glmselect data=AddY NOPRINT
               outdesign(addinputvars)=Want2(drop=_Y);
   class      na_resp na_retract;   /* list the categorical variables here */
   model _Y = na_resp na_retract /  /* and here */
              noint selection=none;
run;

proc print data=Want2; run;