pvareschi
Quartz | Level 8

Re: Predictive Modeling Using Logistic Regression

Given that the parameter estimates with FAST Backward selection are only approximations of the regression coefficients, when would it be appropriate to use this method (see page 3-70 of course text)?

Is the idea to use the FAST option just to identify the most important predictors, and then refit the model on those inputs to get accurate estimates?

Accepted Solution
sasmlp
SAS Employee

The FAST option in the MODEL statement is used mainly for speed. If you have a large number of predictors and a large sample size, the CPU time might be excessive if you use backward elimination without the FAST option. There is no need to refit the final model obtained with the FAST option. If the variable selection methods all arrive at the same final model, the parameter estimates will be the same for that model. The real question is whether the FAST option eliminated any variables that made it into the final model under other variable selection methods. Personally, I like the best subset selection method, which gives you the best subset of variables for each model size (best one-variable model, best two-variable model, etc.). You can then use goodness-of-fit statistics to decide which model to move forward with.
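To make the "approximation versus refit" distinction concrete, here is a minimal Python sketch (not SAS code, and not a reproduction of what PROC LOGISTIC does internally). It assumes the kind of first-order shortcut that the SAS documentation attributes to Lawless and Singhal (1978): after fitting the full model once, the remaining coefficients are adjusted using the full-model covariance matrix instead of re-running maximum likelihood. The data, the variable names, and the helper functions `fit_logistic` and `drop_without_refit` are all hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood fit; returns (beta, covariance)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None])
        step = np.linalg.solve(hessian, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, cov

def drop_without_refit(beta, cov, drop):
    """First-order estimate of the kept coefficients when the `drop`
    coefficients are forced to zero, using only the full-model fit."""
    keep = [j for j in range(len(beta)) if j not in drop]
    v12 = cov[np.ix_(keep, drop)]
    v22 = cov[np.ix_(drop, drop)]
    return keep, beta[keep] - v12 @ np.linalg.solve(v22, beta[drop])

# Synthetic data (assumption): intercept, two real predictors, two noise inputs.
rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=(n, 4))
Z[:, 2] += 0.7 * Z[:, 0]  # one noise input is correlated with a real predictor
X = np.column_stack([np.ones(n), Z])
true_beta = np.array([-0.5, 1.0, -0.8, 0.0, 0.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

full_beta, full_cov = fit_logistic(X, y)                      # one full fit
keep, approx = drop_without_refit(full_beta, full_cov, [3, 4])  # no refit
refit_beta, _ = fit_logistic(X[:, keep], y)                   # exact refit

print("approximate:", np.round(approx, 4))
print("refit      :", np.round(refit_beta, 4))
```

On data like this the two sets of estimates should agree closely but not exactly, which is the trade-off the question is about: the shortcut avoids one model fit per elimination step, at the cost of small approximation error in the intermediate estimates.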

PaigeMiller
Diamond | Level 26

I don't know the answer to your question. I do have a contrary opinion however.

 

Methods like Stepwise, Forward and Backward selection are, in my opinion, train wrecks waiting to happen. And it's not just my opinion: if you search the internet for "problems with stepwise regression", you'll find plenty of people writing about this (here is an example). The problem is not that these selection methods never work; of course they do work sometimes. The problem is that you probably can't tell whether they have given you a reasonable and effective answer, or whether there are much better answers out there that the selection methods have missed. You just can't tell. And all of this stems from the problem of high correlation between your inputs, which leads to high variances of the coefficients, which in turn leads to incorrect selection or deletion of terms from a model and to potentially misleading and hard-to-interpret coefficients.
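A rough illustration of the "high correlation leads to high variance" step, as a Python sketch on simulated data. It uses the variance inflation factor for the two-input case, VIF = 1 / (1 - r^2), where r is the sample correlation between the inputs. Strictly, the VIF is defined for ordinary least squares; treating it as a proxy for what happens in logistic regression is an assumption here. The function names `pearson` and `vif_two_inputs` are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def vif_two_inputs(rho, n=5000, seed=0):
    """Simulate two inputs with target correlation `rho` and return the
    variance inflation factor 1 / (1 - r^2) from their sample correlation."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# The coefficient variance is multiplied by roughly this factor
# relative to the uncorrelated case.
for rho in (0.0, 0.5, 0.9, 0.99):
    print(f"rho={rho:4.2f}  VIF={vif_two_inputs(rho):7.2f}")
```

As the correlation approaches 1, the factor grows without bound, so the p-values that drive a selection step become increasingly unstable.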

--
Paige Miller
pvareschi
Quartz | Level 8

Thank you for sharing your thoughts; I really appreciate your answers as they provide plenty of insight and food-for-thought (so to speak)!

In particular, I had never fully considered or appreciated the practical issues arising from collinear inputs, and how they may also affect input selection and the model-fitting process.
