@sjb1 I believe the difference in runtime you observe is solely due to the SAS data step code your macro generates.
To see the macro-generated code that the SAS compiler actually receives, write it to a file with MPRINT and MFILE and print that file to the log:
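filename mprint temp;              /* MFILE writes to the fileref named MPRINT */
options fullstimer mprint mfile;   /* route the MPRINT output to that file */
%macro_vs_datastep();
/* print the generated code to the log */
data _null_;
  infile mprint;
  input;
  put _infile_;
run;
filename mprint clear;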
With 5 loop iterations (%let months_to_process = 5;), the macro generates the following code:
data input_dta;
length Date t 8;
do i = 1 to 100;
Date = intnx("MONTH", "01JAN1980"d, i, "E");
do t = 1 to 10000;
output;
end;
end;
run;
data data_step;
set input_dta;
if Date = intnx("MONTH", "01JAN1980"d, 1, "E") then output;
if Date = intnx("MONTH", "01JAN1980"d, 2, "E") then output;
if Date = intnx("MONTH", "01JAN1980"d, 3, "E") then output;
if Date = intnx("MONTH", "01JAN1980"d, 4, "E") then output;
if Date = intnx("MONTH", "01JAN1980"d, 5, "E") then output;
run;
data macro;
set input_dta;
if Date = 7364 then output;
if Date = 7395 then output;
if Date = 7425 then output;
if Date = 7456 then output;
if Date = 7486 then output;
run;
The step "data_step" has to call the intnx() function in every IF statement for every observation (5 calls for each of the 1,000,000 rows in input_dta), while step "macro" only compares Date with a numeric constant. I believe that's the reason for the performance difference you observe.
As you already mentioned, it looks like the intnx() function executes on every iteration of the data step rather than being resolved to a SAS date value at compile time.
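If you want to confirm that the numeric literals in step "macro" are the same dates that intnx() returns in step "data_step", a small check like the following could be used. It is only a sketch: the variable names i and d are my own choice, and it simply writes each value to the log both as a number and as a formatted date.

```sas
/* Print the month-end dates for shifts 1 to 5 from 01JAN1980 */
data _null_;
  do i = 1 to 5;
    d = intnx("MONTH", "01JAN1980"d, i, "E");
    /* show the raw SAS date value and the same value as DATE9. */
    put i= d= d= date9.;
  end;
run;
```

The raw values in the log should match the constants 7364, 7395, 7425, 7456 and 7486 in the generated code.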
I guess that if the macro generated the data step code below, the intnx() function would also execute only once, at macro execution time, and the data step would again see just a number:
if Date = %sysfunc(intnx(MONTH, "01JAN1980"d, 1, E)) then output;
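As a rough sketch of how such statements could be generated in a macro loop: the macro name sysfunc_filter, its parameter and the output data set name are hypothetical (I don't know how your macro is actually structured), and the code assumes input_dta from the generated code above already exists.

```sas
/* Hypothetical macro: resolves INTNX at macro execution time */
%macro sysfunc_filter(months_to_process=5);
  %local m;
  data sysfunc_step;
    set input_dta;
    %do m = 1 %to &months_to_process;
      /* %sysfunc replaces the call with a numeric date value,
         so the data step only does a plain comparison */
      if Date = %sysfunc(intnx(MONTH, "01JAN1980"d, &m, E)) then output;
    %end;
  run;
%mend sysfunc_filter;

%sysfunc_filter(months_to_process=5)
```

With options mprint turned on, the log should show the IF statements with resolved numbers instead of intnx() calls.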
...and of course both code versions would benefit from changes in a real implementation.
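One possible change that stays in plain data step code, without any macro logic, is to call intnx() only on the first iteration and keep the results in a temporary array. This is just a sketch under my own assumptions: the names data_step_once, target and k are made up, and input_dta from the generated code above must already exist.

```sas
data data_step_once;
  set input_dta;
  /* _temporary_ array elements are retained across iterations */
  array target[5] _temporary_;
  /* compute the five month-end dates only once */
  if _n_ = 1 then do k = 1 to 5;
    target[k] = intnx("MONTH", "01JAN1980"d, k, "E");
  end;
  /* per observation: simple numeric comparisons only */
  do k = 1 to 5;
    if Date = target[k] then output;
  end;
  drop k;
run;
```

This way intnx() runs 5 times in total instead of 5 times per observation.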