BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ImSpartacus
Fluorite | Level 6

I am attempting to modify a macro shared here. I wanted to concatenate a brief string to the front of each indicator variable's header to mark that it's an indicator and what kind of indicator it is. 

 

Below is some test data and the modified macro. It appears to work great on character categorical variables such as "Category1" below, but it doesn't work properly on numerical categorical variables such as "Category2" below (all of the indicators have values "1" for all records). Any tips/ideas are appreciated. 

 

 

* Import example dummy data ;

DATA TestData0;
	LENGTH Category1 $ 9;
	INPUT ID Category1 Category2;
	DATALINES;
1 CategoryA 1
2 CategoryB 1
3 CategoryA 1
4 CategoryA 2
5 CategoryC 3
6 CategoryB 2
;
RUN;

* Write macro to generate indicators from categorical variables ;

%macro cat(indata,outdata,variable,abbr);

*Scrub special characters and ensure the categorical variable entries are properly formatted;

DATA CatTest;
	SET &indata.;

	tempvar = "IND_"||"&abbr."||SUBSTR(compress(&variable.,,'kad'),1,MIN(33-LENGTH("IND_")-%LENGTH("&abbr."),LENGTH(compress(&variable.,,'kad')))) ;

RUN;

* Actually generate the indicators ;

  proc sql noprint;
   select distinct tempvar 
    into :mvals separated by '|'
    from CatTest;
    %let mdim=&sqlobs;
  quit;

  * %PUT &mvals; * Log variable string for debugging ;

  data &outdata.(DROP=tempvar);
    set CatTest;
    %do _i=1 %to &mdim.;
      %let _v = %scan(&mvals., &_i., |);
      if VType(&variable)='C' then do;
         if tempvar = "&_v." then &_v. = 1;
         else &_v = 0;
      end;
      else do;
         if tempvar = &_v. then &_v. = 1;
         else &_v = 0;
            end;
    %end;
  run;

%mend;

* Run macro ;

%CAT(TestData0,TestData1,Category1,CAT1_);
%CAT(TestData1,TestData2,Category2,CAT2_);

Also, as I am still learning SAS, any general tips or suggestions on this macro are certainly welcome. 

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You probably do NOT want to be using macro code to do this problem.  But if you did then you need to make sure not to mix your macro code and the SAS code that it is generating.  For starters you should add a datastep to determine if the variable is numeric or character and then you can use that value with macro logic to generate different code for numeric an character input variables.

 

data _null_;
  set &indata ;
  call symputx('vtype',vtype(&variable));
  stop;
run;

So now later you could conditionally generate the code to generate the names for the dummy varaibles based on whether your input variable is character or numeric.

DATA CatTest;
  SET &indata.;
  length tempvar $32 ;
%if (&vtype=C) %then %do;
  tempvar = "IND_&abbr" || compress(&variable,,'kad');
%end;
%else %do;
  tempvar = "IND_&abbr" || compress(put(&variable,best%eval(32-%length(IND_&abbr)).),,'kad');
%end;
run;

 

Note that once you have generated the TEMPVAR you know it is character (since it is the name of the dummy varaible) so your last data step can be much simplier as it no longer depends on the type of the original variable. Also you can take advantage of the fact that SAS evaluates boolean expressions to 0 or 1 to indicate false or true, respectively. 

data &outdata (drop=tempvar);
  set CatTest;
%do _i=1 %to &mdim;
  %let _v = %scan(&mvals,&_i,|);
  &_v = (tempvar="&_v") ;
%end;
run;

 

View solution in original post

6 REPLIES 6
Astounding
PROC Star

Regardless of the solution, you can probably get along just fine without these variables.  For example:

 

  • Sort your data set by CATEGORY2, and run a procedure BY CATEGORY2.  Or,
  • Use CATEGORY2 in a CLASS statement (either to process each value separately, or to automatically create and utilize dummy variables such as you are trying to create ... the exact handling depends on the procedure that contains the CLASS statement).

 

Reeza
Super User

This may be helpful. Notice at the end there are links to several other methods. IMO I would recommend one of these approaches. Macro's are great, but save them for when absolutely necessary. A large proportion of problems can be solved in multiple ways.

Tom
Super User Tom
Super User

You probably do NOT want to be using macro code to do this problem.  But if you did then you need to make sure not to mix your macro code and the SAS code that it is generating.  For starters you should add a datastep to determine if the variable is numeric or character and then you can use that value with macro logic to generate different code for numeric an character input variables.

 

data _null_;
  set &indata ;
  call symputx('vtype',vtype(&variable));
  stop;
run;

So now later you could conditionally generate the code to generate the names for the dummy varaibles based on whether your input variable is character or numeric.

DATA CatTest;
  SET &indata.;
  length tempvar $32 ;
%if (&vtype=C) %then %do;
  tempvar = "IND_&abbr" || compress(&variable,,'kad');
%end;
%else %do;
  tempvar = "IND_&abbr" || compress(put(&variable,best%eval(32-%length(IND_&abbr)).),,'kad');
%end;
run;

 

Note that once you have generated the TEMPVAR you know it is character (since it is the name of the dummy varaible) so your last data step can be much simplier as it no longer depends on the type of the original variable. Also you can take advantage of the fact that SAS evaluates boolean expressions to 0 or 1 to indicate false or true, respectively. 

data &outdata (drop=tempvar);
  set CatTest;
%do _i=1 %to &mdim;
  %let _v = %scan(&mvals,&_i,|);
  &_v = (tempvar="&_v") ;
%end;
run;

 

ImSpartacus
Fluorite | Level 6
I greatly appreciate your feedback (haven't tested, but it appears that this is probably how I should've structured it from the start).

One quick question: Could you elaborate on the "you need to make sure not to mix your macro code and the SAS code that it is generating" comment?
Tom
Super User Tom
Super User

@ImSpartacus wrote:


One quick question: Could you elaborate on the "you need to make sure not to mix your macro code and the SAS code that it is generating" comment?

In this case this issue is that within a single data step a variable cannot both be character and numeric. Based on how you first reference it SAS will pick one when compiling the data step.  So to be able to create a data step that can handle the variable being either character or numeric then you need to use macro logic

%if &vtype=C %then %do;
 < code that treats X as character>
%end;
%else %do;
 < code that treats X as character>
%end;

and not data step logic.

if vtype(x)='C' then do;
 < code that treats X as character >
end;
else do;
 < code that treats X as numeric >
end;
ImSpartacus
Fluorite | Level 6
I appreciate the additional clarification.

I was aware that macro logic must be used in a macro and data step logic must be used in a data step, but I didn't know that the two wouldn't play nicely with each other.

Any recommended reading on the topic? I'm sure there are additional nuances that I've overlooked in my limited learning.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1471 views
  • 2 likes
  • 4 in conversation