I am working on a model that performs regression analysis using the PROC GLMSELECT statement. In order to make my program as dynamic as possible I would like to assign all of the numeric variables from my source table to a macro variable that can be referenced in the model statement, thus allowing the name and number of variables in the input table to change without the need to change the regression code. I would also like to create a macro variable for the categorical variables to be referenced in the Class statement. Please help!
There are a few ways to get all variables. The one I would recommend for your purposes is PROC CONTENTS (followed by PROC SQL), because you will probably want to exclude the dependent variable from your list of variables. For example, you could code:
proc contents data=have (drop=&dependent) noprint out=_contents_ (keep=name type);
run;
That gives you a SAS data set with the name and type of every variable in your data set except the dependent variable. In that data set, type=1 means numeric and type=2 means character.
Then SQL takes over:
proc sql noprint;
    /* numeric variables (type=1) */
    select strip(name) into :numvars separated by ' '
        from _contents_
        where type=1;
    /* character variables (type=2) */
    select strip(name) into :charvars separated by ' '
        from _contents_
        where type=2;
quit;
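As a rough sketch of how the two lists could then be used, the step below drops them into PROC GLMSELECT. It assumes &dependent, &numvars and &charvars are already defined as above and that every character variable should be treated as categorical. The stepwise selection method is only a placeholder choice, not something from the original question.

```sas
/* Sketch only: assumes &dependent, &numvars and &charvars exist */
/* and that the input table is called "have", as in the answer.  */
proc glmselect data=have;
    class &charvars;                        /* categorical predictors   */
    model &dependent = &numvars &charvars   /* all remaining variables  */
          / selection=stepwise;             /* placeholder method       */
run;
```

One thing to watch: if a query returns no rows (for example, the table has no character variables), PROC SQL leaves the macro variable untouched. Putting %let numvars=; and %let charvars=; before the PROC SQL step makes an empty list resolve to blank instead of an unresolved reference.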
@tgrandchamp wrote:
I am working on a model that performs regression analysis using the PROC GLMSELECT statement. In order to make my program as dynamic as possible I would like to assign all of the numeric variables from my source table to a macro variable that can be referenced in the model statement, thus allowing the name and number of variables in the input table to change without the need to change the regression code. I would also like to create a macro variable for the categorical variables to be referenced in the Class statement. Please help!
Only you will know which variables are categorical, unless you have a rule that absolutely never fails, such as "a variable is categorical if it has XX or fewer unique values". If that is the case, then PROC FREQ with the NLEVELS option may be helpful, such as:
ods select freq.nlevels;
proc freq data=sashelp.class nlevels;
    ods output nlevels = work.mylevels;
run;
which will build a data set with each variable's name and its number of unique values, along with counts of missing and nonmissing levels. Then something like this,
proc sql noprint;
    select tablevar into :catvars separated by ' '
        from work.mylevels
        where nlevels le 6;
quit;
%put &catvars;
works assuming the XX I mentioned above is 6.
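To illustrate where &catvars would go, here is a minimal sketch that continues the sashelp.class example. The response (Weight) and the continuous predictor (Height) are picked by hand purely for illustration, and the selection method is again a placeholder.

```sas
/* Sketch only: assumes &catvars was built by the PROC SQL step above. */
%let dependent = Weight;   /* hypothetical response variable            */
%let contvars  = Height;   /* hypothetical continuous predictor, by hand */

proc glmselect data=sashelp.class;
    class &catvars;                                   /* few-level variables */
    model &dependent = &catvars &contvars
          / selection=stepwise;                       /* placeholder method  */
run;
```

On your own data, check that the dependent variable does not itself fall under the level threshold. If it does, it will end up in &catvars and on both sides of the MODEL statement.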
There was a very nice article in the SAS Nordic Group by @OskarE on this very topic. Check it out:
Automatic modeling with thousands/millions of inputs but only a few lines of code!