I have a variable with about 1,500 character categories, and I want to create dummy variables for them. Is there a procedure I can use to create these variables? Doing it manually is quite tiresome.
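%macro cat(indata, outdata, variable);
  %local mvals mdim _i _v;

  /* Collect the distinct values of the variable into one |-delimited macro variable */
  proc sql noprint;
    select distinct &variable.
      into :mvals separated by '|'
      from &indata.;
    %let mdim = &sqlobs.;   /* number of distinct values found */
  quit;

  data &outdata.;
    set &indata.;
    /* One 0/1 dummy per distinct value. Each dummy is named after the value, */
    /* so every value must be a valid SAS variable name.                      */
    %do _i = 1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable.) = 'C' then do;
        if &variable. = "&_v." then &_v. = 1;
        else &_v. = 0;
      end;
      else do;
        if &variable. = &_v. then &_v. = 1;
        else &_v. = 0;
      end;
    %end;
  run;
%mend cat;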
The macro above is how I have this set up.
To run it, pass the name of the input dataset, the name of the output dataset, and the variable whose values you want dummy variables for:
%cat(have,want,variable)
Edited at 10:51 PDT. Forgot a ;
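As a quick sanity check, the macro can be tried on a small built-in table before pointing it at the real data. This is only a sketch: it assumes the %cat macro above has already been compiled, and sashelp.class with its SEX variable is a stand-in chosen for illustration, not the poster's data.

```sas
/* Try the macro on sashelp.class (stand-in data, not the real table). */
%cat(sashelp.class, want, sex)

/* SEX takes the values F and M, so WANT should gain two 0/1 columns named F and M. */
proc print data=want(obs=5);
  var name sex F M;
run;
```

Because each dummy is named after a category value, this approach only fits values that are valid SAS names, which is worth checking before running it on 1,500 categories.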
You will need to provide some more details. Are you looking to create one dummy for each level that appears in the variable? Multiple variables with 0/1 coding for some levels? Groups of like values?
You could provide some examples of what you are doing manually to give us an idea.
There are a couple of solutions here:
Hi,
Try proc glmmod with OUTDESIGN= to create dummy variables.
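A minimal sketch of the PROC GLMMOD suggestion, assuming sashelp.class as stand-in data and SEX as the categorical variable. The MODEL statement needs a response, so a placeholder variable (here called _y, a made-up name) is added and dropped again from the output.

```sas
/* Stand-in data plus a placeholder response for the MODEL statement. */
data have;
  set sashelp.class;
  _y = 0;
run;

/* OUTDESIGN= writes the design matrix (columns Col1, Col2, ...);    */
/* OUTPARM= records which category level each column corresponds to. */
proc glmmod data=have outdesign=want(drop=_y) outparm=parms noprint;
  class sex;
  model _y = sex / noint;   /* NOINT leaves out the intercept column */
run;

proc print data=parms; run;
proc print data=want(obs=5); run;
```

The dummy columns come out with generic names, so the OUTPARM= table is what ties each column back to a category value.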
In the above macro, I need to supply the variable name manually. But in some scenarios, say I have 400 variables of which 90 are categorical, it is very difficult to check and pick out all of those variables.
Is there any code available to solve this kind of issue?
Thanks in advance.
This is easy with IML.
proc iml;
use sashelp.class;
read all var {sex};
close;
vnames=unique(sex);
d=design(sex);
create want from d[r=sex c=vnames];
append from d[r=sex];
close;
quit;
proc print;run;
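For the case of many variables where the categorical ones are not known in advance, one option is to let SAS list the character variables and pass that list to PROC GLMMOD, which was suggested earlier in the thread. The sketch below is built on assumptions: WORK.HAVE, the placeholder response _y, and the macro variable charvars are illustrative names, and it treats every character variable as categorical, which may not hold for ID-like or free-text columns.

```sas
/* Stand-in data: sashelp.class without its ID-like NAME column, */
/* plus a placeholder response for the MODEL statement.          */
data have;
  set sashelp.class(drop=name);
  _y = 0;
run;

/* Collect the names of all character variables in WORK.HAVE. */
proc sql noprint;
  select name into :charvars separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'HAVE' and type = 'char';
quit;
%put NOTE: character variables found: &charvars.;

/* One design-matrix column per level of each character variable. */
proc glmmod data=have outdesign=want(drop=_y) outparm=parms noprint;
  class &charvars.;
  model _y = &charvars. / noint;
run;
```

If no character variables are found, charvars is never created, so it is worth checking the log line before running the PROC GLMMOD step.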