DATA Step, Macro, Functions and more

how to use macro and loop to create multiple dummy variables for glm

Reply
Super Contributor
Posts: 345

how to use macro and loop to create multiple dummy variables for glm

I need to create multiple dummy variables, for age, I need to create 3 dummy varibles, for group, 4, for race, 2, for edu, 3.

 

So I am thinking to use a loop macro way

%macro glan(demog);
%let var1 = group;
	%let var2 = race;
%let var3=edu;

data want;
  set have;
  array edu(3) edu1-edu3;
  do i = 1 to 3;
    edu(i)=(educat=i);
  end;
drop i;
run;

I guess I can use nested loop but not sure how to do it. The problem is, each variable has different catogories and need different numbers of dummy variables.

 

Any advice to make use of the array structure to make our coding more efficient? Thanks.

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

Why would you create dummy variables at all?  GLM can handle this for you, with the CLASS statement:

 

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_glm_sect011...

 

Super Contributor
Posts: 345

Re: how to use macro and loop to create multiple dummy variables for glm

Posted in reply to Astounding

Thank you. I want to use those as my confounding factors.

 For example, for race, only black is related and will be considered as a confounding.

For group, one group II, will be considered as my confounding.

So my final model will be

 

Y=aX+ g(2)+ race(b)

So I do not want to add too many that are not related to my model

any advice?

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

If you really only need a handful of them, just hard-code them:

 

data want;

set have;

dummy1 = (group=2);

dummy2 = (race='black');

run;

 

You don't have to create all possible variables, just those that you will need in your equation.

Super Contributor
Posts: 345

Re: how to use macro and loop to create multiple dummy variables for glm

[ Edited ]
Posted in reply to Astounding

Thanks. This is one solution, but not exactly what I want. I need to do multiple analysis, so those dummy variables can be generated for other analyses, not only limited to glm. There are different models. I really want to try the nested loop and macro. Or, if only one, either nested loop, or macro, can work, that will be also great.

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

Here are a few considerations, then.

 

Do you have any long variable names?  It would be convenient to name the dummy variables after the originals, such as race_1, race_2, race_3, etc.  But that can fail if the original variable names are too long.

 

Do the values of any variables contain more than one word? 

 

Could any variables have a missing value?  If so, should a dummy be assigned for the missing values?

Super Contributor
Posts: 345

Re: how to use macro and loop to create multiple dummy variables for glm

Posted in reply to Astounding

The variable names some times are as long as 7 letters

yes, some values contain more than one word, like "HS diploma"

yes, some contain missing values. Missing values can be ommited, since that is a very small number.

thanks

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

[ Edited ]

One last question.  Is it reasonable to ask you to call a macro with a separate list of character vs. numeric variables?  For example:

 

%make_dummies (charlist=edu, numlist=age race)

 

If that is too much work, you would still need to somehow indicate which variables to use for creating dummies, and which to omit.

 

Basically, a macro will find all possible values for your variables, create the dummy variables, and label them.

 

 

And if Tom is suggesting GLMMOD, I would definitely look into that ... very credible source!

Super Contributor
Posts: 345

Re: how to use macro and loop to create multiple dummy variables for glm

Posted in reply to Astounding

yes

 

%make_dummies (charlist=edu group race, numlist=age )

 

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

OK, this will be a bit clumsy.

 

%macro make_dummies (charlist=, numlist=);

%local i j nextvar;

proc freq data=have noprint;

%do i=1 %to %sysfunc(countw(&charlist &numlist);

   %let nextvar = %scan(&charlist &numlist, &i);

   tables &nextvar / out=&nextvar (keep=&nextvar);

%end;

run;

 

data _null_;

call execute ('data want; set have;');

%if %length(&charlist) %then %do i=1 %to %sysfunc(countw(&charlist));

%let nextvar = %scan(&charlist, &i);

   counter=0;

   do until (done&i);

      set &nextvar (where=(&nextvar > ' ')) end=done&i;

      counter + 1;

      call execute(cats("&nextvar._", counter, "= (&nextvar=", &nextvar, ');',

                                  "label &nextvar._", counter, "= '&nextvar';"));

   end;

%end;

%** Similar loop for numerics;

call execute('run;');

stop;

run;

 

%mend make_dummies;

 

It's getting complicated enough that I thought I should stop here before writing the loop for numeric variables.  It needs to be tested and debugged, and perhaps scrapped if Tom's solution is easier.  So it can be tested using character variables only for now.

 

Super Contributor
Posts: 345

Re: how to use macro and loop to create multiple dummy variables for glm

Posted in reply to Astounding

Thank you very much. It is very long and I am lost. Which should be replaced with my own variables? Which should be kept?

Super User
Posts: 5,498

Re: how to use macro and loop to create multiple dummy variables for glm

You would need to put your data into a data set named "have" since that name is hard-coded within the macro.  (That part can always be changed later by convertting the data set name into a macro parameter.)

 

Then call the macro by naming a few character variables:

 

%make_dummies (charlist=name1 name2 name3)

Super User
Super User
Posts: 7,039

Re: how to use macro and loop to create multiple dummy variables for glm

Why don't you just use PROC GLMMOD to do it for you?

Super User
Super User
Posts: 7,039

Re: how to use macro and loop to create multiple dummy variables for glm

[ Edited ]
%let varlist=age sex ;
%let dsin=sashelp.class;
%let dsout=want ;

data step1 ;
  _id +1;
  set &dsin (keep=&varlist);
run;

proc glmmod noprint data=step1 outparm=step2 outdesign=step3 ;
  class &varlist;
  model _id=&varlist /noint;
run;

data step4 ;
  set step2;
  old=cats('COL',_colnum_);
  new=cats(of effname &varlist);
run;

proc sql noprint;
  select catx('=',old,new)
    into :rename separated by ' '
    from step4 
  ;
quit;

data &dsout ;
  set &dsin ;
  set step3 (drop=_id  rename=(&rename)) ;
run;




Ask a Question
Discussion stats
  • 13 replies
  • 529 views
  • 0 likes
  • 3 in conversation