As Ahmed mentioned, DOSUBL can be used for code generation for a problem like this, but there are limitations.
As explained by others, when you use CALL EXECUTE to generate code, your main DATA step generates the code, and the generated code is executed *after* the main step has completed. This is why the timing doesn't work out in your attempt.
DOSUBL provides some *magical* functionality. When you use DOSUBL to generate code, your main DATA step can pause during execution, then the generated code is executed in a separate magical side-session, then your main DATA step resumes. On each iteration of your main DATA step, you can pause and execute a separate DATA step, and return the results to the main DATA step.
Take your original macro, and turn it into a function-style macro, which returns the calculated mean:
%macro calculate_avg(data=,var=,threshold=);
%local rc mean;
%let rc = %sysfunc(dosubl(%nrstr(
data _temp;
set &data(keep=&var);
where &var > &threshold;
excess = &var - &threshold;
run;
proc sql noprint;
select avg(excess)
into :mean trimmed
from _temp;
quit;
)));
&mean /*return*/
%mend calculate_avg;
You can use that macro like a function, so you could test with stuff like:
%put >>%calculate_avg(data=sashelp.cars,var=mpg_city,threshold=10)<< ;
%put >>%calculate_avg(data=sashelp.class,var=height,threshold=65.3)<<
If you want to use that macro like CALL EXECUTE, you can use the RESOLVE function to invoke the macro once on each iteration of the DATA step, like:
data work.results;
set work.percentiles;
Mean_Excess_Loss =input(
resolve('%calculate_avg('
|| ' data= sashelp.cars'
|| ',var= mpg_city '
|| ',threshold= ' || put(percentile_threshold,8.2)
|| ')'
)
,8.
);
run;
You can see how it looks similar to using CALL EXECUTE. But the timing works out, because the code is executed in a side session. That said, there are important limitations to this approach.
One is that your moving non-integer values between data step variables and macro variables, and whenever you do that you need to think carefully about how much precision you need.
The bigger limitation is that the price you pay for the magic of DOSUBL is that it is S-L-O-W. When I made a percentiles dataset with 100 rows, the SQL step completed in 0 seconds. The DOSUBL step took 15 seconds.
It makes sense that DOSUBL is slow. If you call it 100 times it needs to create and destroy a new side-session 100 times. And in each side session it executes a DATA step and a SQL step.
I changed your macro to a function-style macro, because I tend to use DOSUBL in the macro language setting. But that is not essential. You could keep your macro almost like it is, and just add a %GLOBAL statement so that it will generate a global macro variable:
%macro calculate_avg(data=,var=,threshold=);
%global mean ;
data _temp;
set &data(keep=&var);
where &var > &threshold;
excess = &var - &threshold;
run;
proc sql noprint;
select avg(excess)
into :mean trimmed
from _temp;
quit;
%mend calculate_avg;
Then you use DOSUBL function like:
data work.results;
set work.percentiles;
rc=dosubl('%calculate_avg('
|| ' data= sashelp.cars'
|| ',var= mpg_city '
|| ',threshold= ' || put(percentile_threshold,8.2)
|| ')'
);
Mean_Excess_Loss=symgetn('mean') ;
run;
%symdel mean ;
With that approach on each iteration of the DATA step loop, %calculate_avg() will be invoked and will create a global macro variable named MEAN. Then if you want to use that macro variable in the same step, you can use the SYMGETN function to retrieve the value.
... View more