11-05-2014 07:23 AM
|I am building a multiple logistic regression model in SAS. Judging by the percent concordant and the c statistic, the model looks adequate. The other statistics, such as percent discordant, Somers' D, multicollinearity diagnostics, and AIC, are also within acceptable limits.|
|The residuals also meet the assumptions of the model. However, I have a question: do I still need to use model selection techniques such as forward, backward, or stepwise regression? From what I have read in the literature so far, these techniques can slow down the modeling process.|
|Could you please advise under what circumstances it is best to use these selection techniques, and whether there should be a minimum number of independent variables when doing so?|
|Thank you. Shivi|
11-13-2014 04:52 PM
The answers to your questions will depend on your research context, sample size, event rate, the number of predictors, and their redundancy.
Very generally speaking, you want a model that fits the data as well as possible with as few variables as possible. All of the variable selection techniques available in PROC LOGISTIC are flawed and may not select the best subset of variables. They tend to include too many predictors, which may or may not suit your purpose. More modern methods such as LASSO and LAR are not available in PROC LOGISTIC at the moment. Cross-validation may also be something you want to look into.
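If you do want to try one of the built-in selection methods, a minimal sketch might look like the following. The dataset name (work.mydata), the response y coded 1 for the event, and the predictors x1-x10 are all hypothetical placeholders, and the 0.05 stay level is only an illustrative choice, not a recommendation.

```sas
/* Sketch only: work.mydata, y, and x1-x10 are hypothetical names. */

/* Full model, fitted first so its AIC and c statistic can be compared */
proc logistic data=work.mydata;
   model y(event='1') = x1-x10;
run;

/* Same candidate predictors with backward elimination */
proc logistic data=work.mydata;
   model y(event='1') = x1-x10
         / selection=backward   /* forward and stepwise are the other options discussed above */
           slstay=0.05          /* illustrative significance level for a variable to stay */
           details;             /* print the details of each elimination step */
run;
```

Comparing the two fits side by side shows how much, if anything, the selection step changes AIC and the c statistic. Keep in mind that the selected model is evaluated on the same data used to choose it, which is one reason cross-validation is worth considering.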