Desktop productivity for business analysts and programmers

Creating dummy variables for multiple categories in a variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 188
Accepted Solution

Creating dummy variables for multiple categories in a variable

 I have a variable that has some, 1500 character categories, I want to create dummy variables for these categories. Is there any procedure I can use for the creating these variables. Manually it is quite a tiresome task.

 

 


Accepted Solutions
Solution
‎07-17-2016 01:52 AM
Contributor
Posts: 34

Re: Creating dummy variables for multiple categories in a variable

[ Edited ]

 

%macro cat(indata,outdata, variable);
  proc sql noprint;
   select distinct &variable. 
    into :mvals separated by '|'
    from &indata.;
    %let mdim=&sqlobs;
  quit;

  data &outdata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &_v. = 1;
         else &_v = 0;
      end;
      else do;
         if &variable. = &_v. then &_v. = 1;
         else &_v = 0;
            end;
    %end;
  run;
%mend;

This is the way I have my macro set up:

 

 

Run the macro and then just put the name of the input dataset , the name of the output dataset, and the variable which holds the values you are creating the dummy variables for. 

 

%cat(have,want,variable)

 

 

 

Edited at 10:51 PDT. Forgot a ; 

View solution in original post


All Replies
Super User
Posts: 11,139

Re: Creating dummy variables for multiple categories in a variable

You will need to provide some more details. Are you looking to create one level of dummy for each level that appears in the variable? Multiple variables with 0 / 1 coding for some levels? Groups of like values?

 

You could provide a some examples of what you are doing manually to give us some idea.

Regular Contributor
Posts: 188

Re: Creating dummy variables for multiple categories in a variable

So variable is like this:
var1
a
b
c
d
e
f
g

And I want in output dataset:
variables a,b,c,d,e,f,g and correponding dummy coded values as 1/0 wherever they are present/not present
Super User
Posts: 19,181

Re: Creating dummy variables for multiple categories in a variable

Trusted Advisor
Posts: 1,228

Re: Creating dummy variables for multiple categories in a variable

Hi,

 

 

Try proc glmmod with OUTDESIGN= to create dummy variables. 

Super User
Posts: 19,181

Re: Creating dummy variables for multiple categories in a variable

Solution
‎07-17-2016 01:52 AM
Contributor
Posts: 34

Re: Creating dummy variables for multiple categories in a variable

[ Edited ]

 

%macro cat(indata,outdata, variable);
  proc sql noprint;
   select distinct &variable. 
    into :mvals separated by '|'
    from &indata.;
    %let mdim=&sqlobs;
  quit;

  data &outdata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &_v. = 1;
         else &_v = 0;
      end;
      else do;
         if &variable. = &_v. then &_v. = 1;
         else &_v = 0;
            end;
    %end;
  run;
%mend;

This is the way I have my macro set up:

 

 

Run the macro and then just put the name of the input dataset , the name of the output dataset, and the variable which holds the values you are creating the dummy variables for. 

 

%cat(have,want,variable)

 

 

 

Edited at 10:51 PDT. Forgot a ; 

Super User
Posts: 9,878

Re: Creating dummy variables for multiple categories in a variable


It is easy for IML.



proc iml;
use sashelp.class;
read all var {sex};
close;
vnames=unique(sex);
d=design(sex);
create want from d[r=sex c=vnames];
append from d[r=sex];
close;
quit;
proc print;run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 2186 views
  • 3 likes
  • 6 in conversation