DATA Step, Macro, Functions and more

Macro to iterate over an array of variables

Reply
Contributor
Posts: 40

Macro to iterate over an array of variables

 

I want to run a lengthy code over a set of variables.

 

The code needs to fetch two variables and use them in later calculations. I have quite a lot of variables. 

 

I want the macro to ;

 

Example variables DEV_cash_liab_missing cash_liab_missing. They are always the same name distinguished only by the prefix DEV_

 

 

PSI_cash_liab_missing = ABS((DEV_cash_liab_missing-cash_liab_missing)*log(DEV_cash_liab_missing/cash_liab_missing))/100;

Format ALERT_cash_liab_missing $25.;
ALERT_cash_liab_missing = 'Normal PSI';
If PSI_cash_liab_missing>0.05 Then ALERT_cash_liab_missing = 'OBS - Avvik fra PSI oppdaget';
If PSI_cash_liab_missing>0.1 Then ALERT_cash_liab_missing ='OBS! - Høy PSI oppdaget';
If PSI_cash_liab_missing>0.2 Then ALERT_cash_liab_missing ='OBS!! - Svært høy PSI oppdaget';
If PSI_cash_liab_missing >0.3 Then ALERT_cash_liab_missing ='OBS!!! - Ekstermt høy PSI oppdaget';
IF PSI_cash_liab_missing =. Then ALERT_cash_liab_missing ='OBS - ingen observasjoner';

 

 

At the end I need only the PSI_cash_liab_missing and the ALERT_cash_liab_missing

Super User
Super User
Posts: 7,977

Re: Macro to iterate over an array of variables

Let me start by saying you do not need macro, same as always, macro is not needed.  

You have already answered your own question, arrays are for processing lots of the same variables.  As you have not provided any test data (in the form of a datastep) or what the output should look like, I can't say specifically for your situation, but if I had:

ID  VAR1  VAR2 DT1 DT2

I could work with these variables by using arrays:

data want;
  set have;
  array vars{*} var1 var2;  /* To show one way of specifying variables */
  array dt{2};                  /* To show using all dt variables */
  array vals{*} dt:;           /* To show with prefix */
  do i=1 to dim(vars{*});
    /* something here to do on all elements of vars array */
  end;
run;
Super User
Posts: 5,516

Re: Macro to iterate over an array of variables

Here's a very basic example of what a macro would look like for a single variable:

 

%macro calc (varname);

 

PSI_&varname = ABS((DEV_&varname - &varname)*log(DEV_&varname / &varname))/100;

Format ALERT_&varname $25.;
ALERT_&varname = 'Normal PSI';
If PSI_&varname >0.05 Then ALERT_&varname = 'OBS - Avvik fra PSI oppdaget';
If PSI_&varname >0.1 Then ALERT_&varname ='OBS! - Høy PSI oppdaget';
If PSI_&varname >0.2 Then ALERT_&varname ='OBS!! - Svært høy PSI oppdaget';
If PSI_&varname >0.3 Then ALERT_&varname ='OBS!!! - Ekstermt høy PSI oppdaget';
IF PSI_&varname =. Then ALERT_&varname ='OBS - ingen observasjoner';

keep PSI_&varname ALERT_&varname;

 

%mend calc;

 

Inside a DATA step, you would call the macro in this fashion:

 

%calc (cash_liab_missing)

 

If you find this is working, there are a number of issues to consider next:

 

  • How to rewrite the IF/THEN statements so they run faster
  • How to modify the DATA step so it keeps additional variables
  • How to write a second macro that will call this macro for a set of variables instead of just one variable

But it all starts with having a macro that works for one variable.  So this macro would be the starting point.

Super User
Posts: 19,851

Re: Macro to iterate over an array of variables

Also, look into formats for recording variables. It's a quick and easy way IMO. 

PROC Star
Posts: 102

Re: Macro to iterate over an array of variables

Kiteulf,

The macro example supplied by Astounding is a good starting point. Except for a small glitch: rather than using a KEEP statement to keep the output variables, I would use a DROP statement to drop the input variables. That way, you can get away with using multiple macro calls in the same data step.

 

I would write the macro something like this (not tested):

 

 

%macro calc (varname);
do;
  PSI _&varname = ABS((DEV_&varname - &varname)*log(DEV_&varname / &varname))/100;
  length ALERT_&varname $25;
  select;
when(missing(PSI_&varname)) ALERT_&varname ='OBS - ingen observasjoner';
when(PSI_&varname <= 0.05) ALERT_&varname = 'Normal PSI';
when(PSI_&varname <= 0.1) ALERT_&varname = 'OBS - Avvik fra PSI oppdaget';
when(PSI_&varname <= 0.2) ALERT_&varname ='OBS! - Høy PSI oppdaget';
when(PSI_&varname <= 0.3) ALERT_&varname ='OBS!! - Svært høy PSI oppdaget';
otherwise ALERT_&varname ='OBS!!! - Ekstermt høy PSI oppdaget';
end; drop DEV_&varname &varname; end %mend calc;

I changed the IF statements to a SELECT, in order to make the code run faster. The way your code was written, a high value would first be assigned the default, then it would be rewritten each time the PSI value was shown to be higher. And a normal value would only be written once, but it would be tested in every IF statement. Instead, with the SELECT statement, the output variable only gets written once, and the PSI is tested as few times as possible. To use the SELECT statement like this, it is important to remember that the order of the statements is crucially important - the smallest value (missing) first, and then compare  to increasing limits. You could also write it with descending values (using ">" instead of "<=", and the missing case in the OTHERWISE), but I assumed that the normal (small) values would be most frequent, meaning that the code would run faster that way.

 

Instead of using KEEP (when you use KEEP several times, with different variables, what is actually going to be kept?), I used DROP to drop the input variables, which will work even if you call the macro many times in the same datastep.

 

 

Note that I wrapped the code in the macro in DO; - END, and did not put a semicolon after the final END. This style of coding means that the macro can (and should) be used just like a normal SAS statement, and you can write code like

data X;
  set Y;
  if p>4 then
   %calc(a);
  else
   %calc(b);
run;

Regards,

Søren

 

Ask a Question
Discussion stats
  • 4 replies
  • 127 views
  • 1 like
  • 5 in conversation