Shivi82
Quartz | Level 8
Hi,
I am building a multiple logistic regression model in SAS. The model appears significant based on the percent concordant and c statistic values. The other statistics, such as percent discordant, Somers' D, multicollinearity diagnostics, and AIC, are also within acceptable limits.
The residuals also meet the assumptions of the model. However, I have a question: do I still need to use model selection techniques such as forward, backward, or stepwise selection? From what I have read so far, these techniques can slow down the modeling process.
Could you please advise under what circumstances it is best to use these selection techniques, and whether there should be a minimum number of independent variables when doing so?
Thank you. Shivi
1 REPLY 1
Haris
Lapis Lazuli | Level 10

The answers to your questions will depend on your research context, sample size, event rate, the number of predictors, and their redundancy.

Very generally speaking, you want a model that fits the data as well as possible with as few variables as possible.  All of the variable selection techniques available in PROC LOGISTIC are flawed and may not select the best subset of variables.  They tend to include too many predictors, which may or may not matter for your purposes.  More modern methods such as LASSO and LAR are not available in PROC LOGISTIC at the moment.  Cross-validation may also be worth looking into.
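
For illustration only, here is a minimal sketch of what one of the built-in selection methods might look like, together with cross-validated predicted probabilities for the final model. The data set names (work.train, work.scored), the response variable (outcome), and the predictors (x1-x10) are placeholders I made up, and the 0.05 stay level is just an example value, not a recommendation.

```sas
/* Sketch only: work.train, work.scored, outcome, and x1-x10 are hypothetical names. */
proc logistic data=work.train;
   /* Model the probability that outcome = 1 */
   model outcome(event='1') = x1-x10
         / selection=backward   /* start with all predictors, remove one at a time */
           slstay=0.05          /* significance level a predictor needs to stay in */
           details;             /* print the details of each elimination step */
   /* Save cross-validated predicted probabilities for the selected model */
   output out=work.scored predprobs=crossvalidate;
run;
```

Keep in mind that the cross-validated probabilities here are computed for the model that was finally selected. The selection step itself is not repeated, so they will likely still look somewhat optimistic.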
