BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SmcGarrett
Obsidian | Level 7

I need a do loop that will read a categorical variable and then make an indicator variable for each category. Everything I've found on the web relies on the categorical variable being equal to some number which is not the case for me. I do not want to hard code these dummy variables because they will change from one data set to the next and the values will continually be different. Also, the number of categories can be in the 100s. 

 Cat3.JPG

 

Cat4.JPG

 

 

All help would be so greatly appreciated. 

 

The code below is the closest that I have gotten to getting this to work. It just doesn't work because of the numberical problem. 

 

 

 

 

%macro cat(indata, variable);
  proc sql noprint;
    select distinct &variable. into :mvals separated by '|'
    from &indata.;

    %let mdim=&sqlobs;
  quit;

  data &indata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if &variable. = &_v. then &variable.&_v. = 1; else &variable.&_v = 0;
    %end;
  run;
%mend;

%cat(people, income);

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

For this approach to work you need to make sure that the length of text resulting from &variable.&_v.

never exceeds 32 characters. It also may not contain spaces or charcters other than letters, digits and _.

The function NVALID  would allow you validate whether &variable.&_v creates a valid name.

You may also run into problems creating a variable name that already exists in your dataset.

 

You can use the VTYPE function to execute code conditionally. You have an issue with how character and numeric variables would behave.

 

I am very leery of using:

Data &indata;

   Set &indata;

construct in this context as there is a potential for destroying existing data if the &variable.&_v duplicates an existing variable.

 

This code addresses the Character/Numeric issue.

%macro cat(indata, variable);
  proc sql noprint;
    select distinct &variable. into :mvals separated by '|'
    from &indata.;

    %let mdim=&sqlobs;
  quit;

  data &indata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &variable.&_v. = 1;
         else &variable.&_v = 0;
      end;
      Else do;
         if &variable. = &_v. then &variable.&_v. = 1;
         else &variable.&_v = 0;
      end;
    %end;
  run;
%mend;

View solution in original post

4 REPLIES 4
ballardw
Super User

For this approach to work you need to make sure that the length of text resulting from &variable.&_v.

never exceeds 32 characters. It also may not contain spaces or charcters other than letters, digits and _.

The function NVALID  would allow you validate whether &variable.&_v creates a valid name.

You may also run into problems creating a variable name that already exists in your dataset.

 

You can use the VTYPE function to execute code conditionally. You have an issue with how character and numeric variables would behave.

 

I am very leery of using:

Data &indata;

   Set &indata;

construct in this context as there is a potential for destroying existing data if the &variable.&_v duplicates an existing variable.

 

This code addresses the Character/Numeric issue.

%macro cat(indata, variable);
  proc sql noprint;
    select distinct &variable. into :mvals separated by '|'
    from &indata.;

    %let mdim=&sqlobs;
  quit;

  data &indata.;
    set &indata.;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if &variable. = "&_v." then &variable.&_v. = 1;
         else &variable.&_v = 0;
      end;
      Else do;
         if &variable. = &_v. then &variable.&_v. = 1;
         else &variable.&_v = 0;
      end;
    %end;
  run;
%mend;
SmcGarrett
Obsidian | Level 7

THANK YOU THANK YOU THANK YOU!

LinusH
Tourmaline | Level 20

SInce you didin't posted any executable indata code, I'll just to paper examine your code.

What do you mean by numberical problem?

 

From what I see in the data step, I think you shouldn't use &Variable. in the assignments, just &_v, no?

 

Whenever your are trouble shooting macros, examine the logs after setting symbolgen, mprint ant potentially mlogic.

 

I know this is a foo example to simplify your post, but the data structure in your output makes little sense. I guess that you have a proper id variable somewhere. Either way, why keeping the original category var? I guess is a way to transpose your data for sttaistical/mining like analysis - but then you usually condense no of rows, which you can't if you keep your category var.

Data never sleeps
data_null__
Jade | Level 19

SAS has tools to make this easy.

 

proc transreg design data=sashelp.class;
   model class(name / zero=none cprefix=0);
   output out=design;
   run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 4157 views
  • 4 likes
  • 4 in conversation