Desktop productivity for business analysts and programmers

Creating dummy variables for multiple categories in a variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 196
Accepted Solution

Creating dummy variables for multiple categories in a variable

 I have a variable that has some, 1500 character categories, I want to create dummy variables for these categories. Is there any procedure I can use for the creating these variables. Manually it is quite a tiresome task.

 

 


Accepted Solutions
Solution
‎07-17-2016 01:52 AM
Contributor
Posts: 41

Re: Creating dummy variables for multiple categories in a variable

[ Edited ]
Posted in reply to munitech4u

 

%macro cat(indata,outdata, variable);
  proc sql noprint;
   select distinct &variable. 
    into :mvals separated by '|'
    from &indata.;
    %let mdim=&sqlobs;
  quit;

  data &outdata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &_v. = 1;
         else &_v = 0;
      end;
      else do;
         if &variable. = &_v. then &_v. = 1;
         else &_v = 0;
            end;
    %end;
  run;
%mend;

This is the way I have my macro set up:

 

 

Run the macro and then just put the name of the input dataset , the name of the output dataset, and the variable which holds the values you are creating the dummy variables for. 

 

%cat(have,want,variable)

 

 

 

Edited at 10:51 PDT. Forgot a ; 

View solution in original post


All Replies
Super User
Posts: 13,066

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to munitech4u

You will need to provide some more details. Are you looking to create one level of dummy for each level that appears in the variable? Multiple variables with 0 / 1 coding for some levels? Groups of like values?

 

You could provide a some examples of what you are doing manually to give us some idea.

Regular Contributor
Posts: 196

Re: Creating dummy variables for multiple categories in a variable

So variable is like this:
var1
a
b
c
d
e
f
g

And I want in output dataset:
variables a,b,c,d,e,f,g and correponding dummy coded values as 1/0 wherever they are present/not present
Super User
Posts: 22,857

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to munitech4u
Trusted Advisor
Posts: 1,270

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to munitech4u

Hi,

 

 

Try proc glmmod with OUTDESIGN= to create dummy variables. 

Super User
Posts: 22,857

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to munitech4u
Solution
‎07-17-2016 01:52 AM
Contributor
Posts: 41

Re: Creating dummy variables for multiple categories in a variable

[ Edited ]
Posted in reply to munitech4u

 

%macro cat(indata,outdata, variable);
  proc sql noprint;
   select distinct &variable. 
    into :mvals separated by '|'
    from &indata.;
    %let mdim=&sqlobs;
  quit;

  data &outdata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &_v. = 1;
         else &_v = 0;
      end;
      else do;
         if &variable. = &_v. then &_v. = 1;
         else &_v = 0;
            end;
    %end;
  run;
%mend;

This is the way I have my macro set up:

 

 

Run the macro and then just put the name of the input dataset , the name of the output dataset, and the variable which holds the values you are creating the dummy variables for. 

 

%cat(have,want,variable)

 

 

 

Edited at 10:51 PDT. Forgot a ; 

Occasional Contributor
Posts: 12

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to SmcGarrett

In the above macro, I need to give variable name manually. But in some scenarios like I have 400 variables and out of those 400 variables, 90 variables are categorical variables. Then it's very difficult to check and picking all those variables.

 

Is there any code available to solve these kind of issues ?

 

Thanks in advance.

Super User
Posts: 10,618

Re: Creating dummy variables for multiple categories in a variable

Posted in reply to munitech4u

It is easy for IML.



proc iml;
use sashelp.class;
read all var {sex};
close;
vnames=unique(sex);
d=design(sex);
create want from d[r=sex c=vnames];
append from d[r=sex];
close;
quit;
proc print;run;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 6031 views
  • 3 likes
  • 7 in conversation