I want to run a lengthy code over a set of variables.
The code needs to fetch two variables and use them in later calculations. I have quite a lot of variables.
I want the macro to ;
Example variables DEV_cash_liab_missing cash_liab_missing. They are always the same name distinguished only by the prefix DEV_
PSI_cash_liab_missing = ABS((DEV_cash_liab_missing-cash_liab_missing)*log(DEV_cash_liab_missing/cash_liab_missing))/100;
Format ALERT_cash_liab_missing $25.;
ALERT_cash_liab_missing = 'Normal PSI';
If PSI_cash_liab_missing>0.05 Then ALERT_cash_liab_missing = 'OBS - Avvik fra PSI oppdaget';
If PSI_cash_liab_missing>0.1 Then ALERT_cash_liab_missing ='OBS! - Høy PSI oppdaget';
If PSI_cash_liab_missing>0.2 Then ALERT_cash_liab_missing ='OBS!! - Svært høy PSI oppdaget';
If PSI_cash_liab_missing >0.3 Then ALERT_cash_liab_missing ='OBS!!! - Ekstermt høy PSI oppdaget';
IF PSI_cash_liab_missing =. Then ALERT_cash_liab_missing ='OBS - ingen observasjoner';
At the end I need only the PSI_cash_liab_missing and the ALERT_cash_liab_missing
Let me start by saying you do not need macro, same as always, macro is not needed.
You have already answered your own question, arrays are for processing lots of the same variables. As you have not provided any test data (in the form of a datastep) or what the output should look like, I can't say specifically for your situation, but if I had:
ID VAR1 VAR2 DT1 DT2
I could work with these variables by using arrays:
data want; set have; array vars{*} var1 var2; /* To show one way of specifying variables */ array dt{2}; /* To show using all dt variables */ array vals{*} dt:; /* To show with prefix */ do i=1 to dim(vars{*}); /* something here to do on all elements of vars array */ end; run;
Here's a very basic example of what a macro would look like for a single variable:
%macro calc (varname);
PSI_&varname = ABS((DEV_&varname - &varname)*log(
Format ALERT_&varname $25.;
ALERT_&varname = 'Normal PSI';
If PSI_&varname >0.05 Then ALERT_&varname = 'OBS - Avvik fra PSI oppdaget';
If PSI_&varname >0.1 Then ALERT_&varname ='OBS! - Høy PSI oppdaget';
If PSI_&varname >0.2 Then ALERT_&varname ='OBS!! - Svært høy PSI oppdaget';
If PSI_&varname >0.3 Then ALERT_&varname ='OBS!!! - Ekstermt høy PSI oppdaget';
IF PSI_&varname =. Then ALERT_&varname ='OBS - ingen observasjoner';
keep PSI_&varname ALERT_&varname;
%mend calc;
Inside a DATA step, you would call the macro in this fashion:
%calc (cash_liab_missing)
If you find this is working, there are a number of issues to consider next:
But it all starts with having a macro that works for one variable. So this macro would be the starting point.
Also, look into formats for recording variables. It's a quick and easy way IMO.
Kiteulf,
The macro example supplied by Astounding is a good starting point. Except for a small glitch: rather than using a KEEP statement to keep the output variables, I would use a DROP statement to drop the input variables. That way, you can get away with using multiple macro calls in the same data step.
I would write the macro something like this (not tested):
%macro calc (varname); do; PSI _&varname = ABS((DEV_&varname - &varname)*log(DEV_&varname / &varname))/100; length ALERT_&varname $25;
select;
when(missing(PSI_&varname)) ALERT_&varname ='OBS - ingen observasjoner';
when(PSI_&varname <= 0.05) ALERT_&varname = 'Normal PSI';
when(PSI_&varname <= 0.1) ALERT_&varname = 'OBS - Avvik fra PSI oppdaget';
when(PSI_&varname <= 0.2) ALERT_&varname ='OBS! - Høy PSI oppdaget';
when(PSI_&varname <= 0.3) ALERT_&varname ='OBS!! - Svært høy PSI oppdaget';
otherwise ALERT_&varname ='OBS!!! - Ekstermt høy PSI oppdaget';
end; drop DEV_&varname &varname; end %mend calc;
I changed the IF statements to a SELECT, in order to make the code run faster. The way your code was written, a high value would first be assigned the default, then it would be rewritten each time the PSI value was shown to be higher. And a normal value would only be written once, but it would be tested in every IF statement. Instead, with the SELECT statement, the output variable only gets written once, and the PSI is tested as few times as possible. To use the SELECT statement like this, it is important to remember that the order of the statements is crucially important - the smallest value (missing) first, and then compare to increasing limits. You could also write it with descending values (using ">" instead of "<=", and the missing case in the OTHERWISE), but I assumed that the normal (small) values would be most frequent, meaning that the code would run faster that way.
Instead of using KEEP (when you use KEEP several times, with different variables, what is actually going to be kept?), I used DROP to drop the input variables, which will work even if you call the macro many times in the same datastep.
Note that I wrapped the code in the macro in DO; - END, and did not put a semicolon after the final END. This style of coding means that the macro can (and should) be used just like a normal SAS statement, and you can write code like
data X; set Y; if p>4 then %calc(a); else %calc(b); run;
Regards,
Søren
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.