Hi guys,
I'm trying to run a macro on a set of columns (variables) of a dataset. The macro is the following:
%macro glm(dataset, y, x);
ods select OverallANOVA FitStatistics ParameterEstimates;
ods table ParameterEstimates=param_&y FitStatistics=fit_&y OverallANOVA=ANOVA_&y;
proc glm data=&dataset;
class count;
model &y. = &x. /solution;
quit;
%mend;
%glm(TSM_Mods, n_Mod1_PS, count);
The macro currently runs on the column n_Mod1_PS, but it has to run on n_Mod2_PS, ..., n_Mod99_PS, then on n_Mod1_CV, n_Mod2_CV, ..., n_Mod99_CV, then on n_Mod1_INF, n_Mod2_INF, ..., n_Mod99_INF. How can I modify the %glm(TSM_Mods, ..., count) line to run the macro on all the variables listed here that are present in the dataset TSM_Mods?
Thank you in advance
Something like this should do the job:
data SetWithVariablesNames;
input variableName $ 32.;
cards;
n_Mod1_PS
n_Mod2_PS
...
n_Mod99_PS
n_Mod1_CV
n_Mod2_CV
...
n_Mod99_CV
n_Mod1_INF
n_Mod2_INF
...
n_Mod99_INF
;
run;
data _null_;
set SetWithVariablesNames;
call execute('%nrstr(%glm(TSM_Mods, ' !! variableName !! ', count))');
run;
Assuming the data set with your variables exists, you can use the following PROC TRANSPOSE trick to get the list of selected variables:
proc transpose
data = TSM_Mods(obs=0 keep=n_Mod:) /* <--- select variables you need */
out = SetWithVariablesNames(rename=(_name_=variableName));
var _all_;
run;
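If the n_Mod: prefix picks up more columns than the three families listed in the question, the name list could also be built from the dictionary tables and filtered by suffix. This is only a sketch under assumptions: it assumes TSM_Mods lives in the WORK library, and the regular expression is a guess at the naming pattern (n_Mod, a number, then _PS, _CV or _INF).

```sas
/* Sketch: build the list of variable names from DICTIONARY.COLUMNS.  */
/* Assumptions: TSM_Mods is in WORK; names look like n_Mod<number>_PS, */
/* n_Mod<number>_CV or n_Mod<number>_INF.                              */
proc sql noprint;
  create table SetWithVariablesNames as
  select name as variableName
  from dictionary.columns
  where libname = 'WORK'          /* library names are stored in upper case */
    and memname = 'TSM_MODS'      /* data set names are stored in upper case */
    and prxmatch('/^n_Mod\d+_(PS|CV|INF)$/i', strip(name)) > 0
  order by varnum;                /* keep the column order of the data set */
quit;
```

The resulting data set has the same variableName column, so the CALL EXECUTE step above can be used unchanged.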
Bart
To answer your specific question about how to use macros for this problem, see here:
https://communities.sas.com/t5/New-SAS-User/Run-a-macro-on-many-files/m-p/934673#M42004
However, @Rick_SAS shows that running many regressions does not require a macro at all.
Which brings up a bigger point: what are you going to do with all of these regressions once you run them? Have you even thought about that? I assume you have, but really, running hundreds of regressions is a poor approach. Fitting something like a Partial Least Squares (PROC PLS) model, with all the variables in the model at once, seems like it might be a better approach, and it's a hell of a lot faster and much easier to program. Please see this paper by Randy Tobias of SAS Institute about PLS, where he fits a PLS model to 1,000 X variables (all in the model at once) and gets a useful result. Also see this paper about PLS by Bartell (who works for JMP).
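For the macro-free route mentioned above, one common pattern is to reshape the data to long form and let a BY statement do the looping. The following is a rough sketch, not necessarily what the linked post does. It assumes all n_Mod: variables are numeric, and the names Long, VarName, Value, param_all, fit_all and anova_all are made up for illustration.

```sas
/* Sketch: one PROC GLM call with BY groups instead of one macro call   */
/* per variable. Assumption: every n_Mod: variable is numeric.          */
data Long;
  set TSM_Mods;
  array v[*] n_Mod:;              /* all variables whose names start with n_Mod */
  length VarName $32;
  do i = 1 to dim(v);
    VarName = vname(v[i]);        /* name of the current response variable */
    Value   = v[i];               /* its value for this observation */
    output;
  end;
  keep VarName Value count;
run;

proc sort data=Long;
  by VarName;
run;

/* The BY variable is carried into each ODS output data set. */
ods output ParameterEstimates=param_all
           FitStatistics=fit_all
           OverallANOVA=anova_all;
proc glm data=Long;
  by VarName;
  class count;
  model Value = count / solution;
quit;
```

Each output data set then holds the results for all response variables stacked together, identified by VarName, rather than one data set per variable.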