DATA Step, Macro, Functions and more

recode a variable within a data step using macro variable conditions

Reply
Frequent Contributor
Frequent Contributor
Posts: 133

recode a variable within a data step using macro variable conditions

How can I combine my macro variable conditions to recode a variable in a data step? I try not to use macro variables when part of the data step but even when I change the code below to %do %if/%then outside of the dataset I run into errors, plus I'd rather run one data step rather than a data step for each value of a. 

 

I have a  list of variable names in list a and a list of variable names in list b. I need to do a loop within a loop to check that the names in list a = names in list b. If they are equal then I want to recode the variable name in list a to equal the element in a different list, b2 that is associated with list b. 

 

 

%let a = gender, age  ;  /*(in this example, yes, commas)*/

%let a_n = 2 ;

%let b= age height weight gender ;

%let b_n = 4;

%let b2= "unk1" "unk2" "unk3"  "unk4";  /*(in my example they are quoted)*/

 

Data want ;

Data have ;

do x = 1 to &a_n. ;

  do y=1 to &b_n. ;

      if scan(&a, &x, ',') = scan(&b., &y.) then do ;

        if scan(&a,&x,' ') = '' then scan(&a,&x)= scan(&b2,&y.) ;

        end;

     end ;

end ;

run;

 

I want the above to resolve to :

 

data want ;

data have ;

if age = '' then age="unk1" ;

if gender = "" then gender="unk4" ;

run;

 

Frequent Contributor
Frequent Contributor
Posts: 133

Re: recode a variable within a data step using macro variable conditions

and the entire block of code is within a macro
Super User
Posts: 17,868

Re: recode a variable within a data step using macro variable conditions

Can you use formats instead? If you have datasets with the data you can create the formats using a CNTLIN dataset and then you can dynamically build a FORMAT statement using SASHELP. VCOLUMNS.

 

You may want to use SASHELP anyways as it can tell you the variables in the data set and the formats your want to apply rather than the scan/check list process. 

Frequent Contributor
Frequent Contributor
Posts: 133

Re: recode a variable within a data step using macro variable conditions

This does not work either:

 

%macro test() ;

 

%let a = gender, age  ;  /*(in this example, yes, commas)*/

%let a_n = 2 ;

%let b= age height weight gender ;

%let b_n = 4;

%let b2= "unk1" "unk2" "unk3"  "unk4";  /*(in my example they are quoted)*/

 

Data want ;

Data have ;

array aa[&a_n.]  &a. ;

array bb[&b_n.]  &b2. ;

%do x = 1 %to &a_n. ;

  %do y=1 %to &b_n. ;

     %if %scan(&a, &x, ',') = %scan(&b., &y.) %then %do ;

        if aa[x] = '' then aa[x]=bb[y];

        %end;

     %end ;

%end ;

run;

 

%mend;

%test () ;

Super User
Posts: 5,085

Re: recode a variable within a data step using macro variable conditions

Yes, this can be done.

 

As usual, you need to be warned that you should not be doing this.  You need to learn more SAS language first.  Your question illustrates this in a couple of ways.  You have this statement in several places:

 

data have;

 

That should actually be a SET statement.  So even if you got macro language to work, you would still have debugging to do.  And it's much harder to debug a program when you have a combination of macro language errors and SAS language errors. 

 

Next, consider this statement:

 

if age='' then age="unk1";

 

This could only work properly if AGE is defined as a character variable, with a length of at least 4.  That would be a very unusual definition for a variable named AGE, but it is possible.

 

At any rate, here's a deal.  If you demonstrate sufficient macro language skills for this next task, I will finish the job later later today.  You have several %LET statements in your example.  Use those macro variables only, and use them inside a macro.  No DATA step code is necessary.  The macro code that you write must use the existing macro variables to generate the following macro statements:

 

%local age;

%let age = "unk1";

%local height;

%let height = "unk2";

%local weight;

%let weight = "unk3";

%local gender;

%let gender = "unk4";

 

If you can program that, I will program the rest of it.

Frequent Contributor
Frequent Contributor
Posts: 133

Re: recode a variable within a data step using macro variable conditions

 

 

data have;   was a typo. 

 

if age='' then age="unk1";   - yes age is character field (age groups in alpha labels - not my code, part of a bigger project) ;

 

 

Super User
Posts: 5,085

Re: recode a variable within a data step using macro variable conditions

Great.  So far so good.

Frequent Contributor
Frequent Contributor
Posts: 133

Re: recode a variable within a data step using macro variable conditions

I have multiple datasets I am looping through with various assignments for my variable lists. The number of variables and the variable names will differ by dataset and by group variable (inner loop) - so many combinations of variable lists. It's important I, at least try, to stick to a list structure rather than name individual variables to a %let statement. My %let statements in my posts were to show you the macro variables I am dealing with as they relate to the code I am having trouble with.The actual lists were generated from proc sql into: or call symputs within the loops. I'll be more careful next time to explain the issues better. Thanks anyway.  I figured it out.

Super User
Posts: 17,868

Re: recode a variable within a data step using macro variable conditions

If your question is answered, please mark your answer as the solution and close the thread.

Super User
Posts: 17,868

Re: recode a variable within a data step using macro variable conditions

If you don't need to the actual conversion, formats will be very efficient. Otherwise, even the ability to use PUTC/PUTN in combination of the format name would help. And formats are one of the fastest ways to recode data. 

PROC Star
Posts: 172

Re: recode a variable within a data step using macro variable conditions

Hi , Sorry for jumping in a bit too late. I have a feeling, you could make the solution less tedious and more effective by using Hashes or formats( as @Reeza )suggested. I hope your solution is working, nevertheless the logic can be hard on the macro processor and the storage of large sets of variable lists in macro vars. Precisely, that invokes heavy macro processor work even before the datastep compiler is involved. 

 

I haven't been active here lately however if I find time after my Final exams, I will try out the Hash solution and post it here. Anyways, if i missed out a point or two to read between the lines, kindly ignore my post. My apologies.

 

Regards,

Naveen Srinivasan

Frequent Contributor
Frequent Contributor
Posts: 133

Re: recode a variable within a data step using macro variable conditions

Thank you for your suggestions. I'm using the list variables in different ways so I'm not sure formats is the answer. I am familiar with hash and have used them in other projects as set table lookups but for this project proc sqls with dyanmically changing group by statements is all I need. Plus, it forces me to learn the macro facility!! I'm all set though so I don't want to take up your time. Good luck on your exams!

Respected Advisor
Posts: 3,896

Re: recode a variable within a data step using macro variable conditions

[ Edited ]

@CP2

Plus, it forces me to learn the macro facility! 

Sure thing! Use your chance to dig deeper into the SAS language. But don't use language structures only because it's fun for you (unless it's purely "academic").

 

As for macro code:

Very powerful but only use it if you "must" as it makes your code easily "messy" and hard to read and maintain.

 

May be some code like below is what you're after?

data recode_list;
  infile datalines dsd ;
  input _varname:$32. _val_orig:$20. _val_recode:$20.;
cards;
SEX,'F','X'
NAME,'ALFRED','Alf'
NAME,'WILLIAM','Wil'
NAME,'THOMAS','Tom'
;
run;

%macro recode(list);
  array char_vars {*} _character_;
  drop _:;
  if _n_=1 then 
    do;
      if 0 then set &list;
      dcl hash h1(dataset:"&list");
      h1.defineKey('_varname','_val_orig');
      h1.defineData('_val_recode');
      h1.defineDone();
    end;

  do _i=1 to dim(char_vars);
    if h1.find(key:upcase(vname(char_vars[_i])), key:upcase(char_vars[_i]))=0 then
      do;
        char_vars[_i]=strip(_val_recode);
      end;
  end;
%mend;

data test1;
  set sashelp.class;
  %recode(recode_list)
run;

data test2;
  set sashelp.classfit;
  %recode(recode_list)
run;

 

Respected Advisor
Posts: 3,896

Re: recode a variable within a data step using macro variable conditions

@CP2

If using a macro for this then you've got so many different parameters to set that things become cumbersome.

 

I'd go for a macro free approach. It doesn't hurt to have a few data steps with similar code blocks as this will still be much easier to read and maintain than implementing this via macro code.

Super User
Super User
Posts: 6,502

Re: recode a variable within a data step using macro variable conditions

This type of problem is MUCH easier to do without macro code.  

Your "A" list is not used in your example of the generated code.  Perhaps you intended it to loop over a list of dataset? But for now let's just ignore it.  So your input metadata probably should look something like this.

data recodes;
  length varname $32 missing $32 ;
cards;
age unk1
height unk2
weight unk3
gender unk4
;

Then perhaps you have some list of tables that you want to transform

%let dsnlist=demo labs;

And you have some metadata about what variables on in which datasets.  So you could combine them and generate a list of variables that exist and need the transformation code written for them.  So the result might be a table like this.

data ds_recodes;
  length member varname $32 missing $32 ;
  input member varname missing;
cards;
demog age unk1
demog height unk2
demog weight unk3
demog gender unk4
labs age unk1
labs height unk2
labs weight unk3
labs gender unk4
;

Now you can use that metadata table to generate the SAS program code you need to recode the missing values of the variables.

filename code temp;
data _null_;
  set ds_recodes ;
  by member ;
  file code ;
  if first.member then put
/ 'data ' member ';'
/ '  set ' member ';'
  ;
  put 
  '  if missing(' varname ') then ' varname '=' missing :$quote. ';'
  ;
  if last.member then put
  'run;'
  ;
run;

Which for the data file above will generate code like this

data demog ;
  set demog ;
  if missing(age ) then age ="unk1" ;
  if missing(height ) then height ="unk2" ;
  if missing(weight ) then weight ="unk3" ;
  if missing(gender ) then gender ="unk4" ;
run;

data labs ;
  set labs ;
  if missing(age ) then age ="unk1" ;
  if missing(height ) then height ="unk2" ;
  if missing(weight ) then weight ="unk3" ;
  if missing(gender ) then gender ="unk4" ;
run;

Which you can then execute by using a %INCLUDE statement.

%include code / source2 ;

 

 

Ask a Question
Discussion stats
  • 15 replies
  • 257 views
  • 4 likes
  • 6 in conversation