Hello,
I am attempting to build a model with 7 predictors and a binary outcome. I can only use stepwise selection for my assignment. Here is what I currently have in SAS University Edition. I know I need to account for possible interactions, but how would I do this? Do I include all possible interaction terms in THIS model, or do I code additional models, each one containing a different set of variables/interactions of interest? Thanks in advance for any help!
proc logistic data=fp;
class race(ref='White') Sex(ref='F')/param=reference;
model mort_10yr(ref='0') = age sex race educ income_pov_indx log_cotinine log_tungsten FP FI
/selection=stepwise slentry=0.2 slstay=0.15 details lackfit;
run;
Since this is an assignment, there are probably some other things that affect the answer. You have 7 variables, and the two-way interactions alone are a lot.
You can specify interaction terms in the model statement as:
model mort_10yr(ref='0') = age | sex | race | educ @2 / <list of options>;
The | (bar) operator tells SAS to include the main effects and the interactions among the listed variables, and the @2 tells SAS to limit the interactions to those between 2 variables.
@3 would also allow 3-way interactions such as age*sex*race.
You can explicitly list interactions such as:
age*sex educ*race
This is covered in the documentation here:
You can include all of these terms in the initial model if you have enough observations to allow for that. You're possibly testing 7 main effects + 21 two-way interactions = 28 terms, and categorical variables with more than two levels will increase the number of parameters. A common rule of thumb is 25 observations per predictor, so you would need at minimum 28*25 = 700 observations to start off with.
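As a rough sketch only, here is how the bar notation might be combined with the options from the original post. It assumes the data set fp and the variable names shown there, and it uses just the four variables from the example above; the remaining predictors could be added to the bar list in the same way. HIERARCHY=SINGLE is written out only to make the behavior visible: an interaction can enter only when its main effects are already in the model.

proc logistic data=fp;
  class race(ref='White') sex(ref='F') / param=reference;
  /* Bar operator with @2: all main effects plus every two-way interaction */
  /* of the listed variables (four shown here for illustration only).     */
  model mort_10yr(ref='0') = age | sex | race | educ @2
        / selection=stepwise slentry=0.2 slstay=0.15
          hierarchy=single   /* keep main effects in before their interactions */
          details lackfit;
run;

With four variables this offers stepwise 4 main effects and 6 two-way interactions, so the candidate list grows quickly as more predictors are added.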
I'm going to move this to the Stats forum so the stats people can comment.
Thank you so much for the informative reply! I only have 520 observations, so it appears I won't be able to use the method you suggested. I didn't realize there was a board for statistical questions. Thank you!
It's not a can't, it's a shouldn't. Hopefully someone has better advice for you 🙂