Contributor
Posts: 22

# Best Fit Logistic Regression Model

Hello all!

I need to fit a logistic regression model and am wondering which model-selection method would be best. I have been advised to stay away from forward/backward/stepwise regression. All-possible-regression seems attractive, but I must admit I'm a little lost on AIC/BIC/Cp/etc. and exactly how I would go about picking the best model...

I have a binary response variable, a categorical predictor, 10 categorical covariates, and 2 continuous covariates.

Super User
Posts: 20,755

## Re: Best Fit Logistic Regression Model

Search for "Model Selection Method" on here. This topic comes up frequently, and there is no single 'CORRECT' answer, but some answers are more valid than others.

Contributor
Posts: 22

## Re: Best Fit Logistic Regression Model

Unfortunately I've been all over the boards and haven't found anything useful. I've also read several papers - I just can't seem to locate the syntax for an all-possible. In addition, I was hoping someone could break it down for me in less technical language so I could really understand AIC/Cp/etc...

Contributor
Posts: 53

## Re: Best Fit Logistic Regression Model

After identifying the candidate variables (quantitative, categorical, and interaction terms) for inclusion in the model using any of these selection options:

- selection = stepwise slentry = 0.15 slstay = 0.15
- selection = forward slentry = 0.15
- selection = backward slstay = 0.15
- selection = score

you can compare the resulting models on the following criteria:

• -2LogL
  - The value itself is not important. It is used to compare two nested models. The larger model always has a -2LogL at least as small as the smaller one, so the question is whether the drop is big enough: the difference in -2LogL between two nested models is approximately chi-square distributed, with degrees of freedom equal to the number of extra parameters.
• AIC (Akaike Information Criterion)
  - AIC can be used to compare non-nested models fitted to the same sample. The AIC value itself is not meaningful, but the model with the smallest AIC is considered the best.
• SC (Schwarz Criterion)
  - The model with the smallest SC is most desirable, but the value itself is not meaningful. Like AIC, it is appropriate for non-nested models.
• ROC Area
  - The area under the ROC curve measures the model’s ability to discriminate between event and non-event.
  - Large values are desirable (predictive accuracy for (event, non-event) pairs).
  - ROC = 0.5: no discrimination (no better than a coin toss)
  - 0.7 <= ROC < 0.8: acceptable discrimination
  - 0.8 <= ROC < 0.9: excellent discrimination
  - ROC >= 0.9: outstanding discrimination
• Brier’s Score
  - Small values are desirable.
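If it helps to see what these numbers actually are, here is a minimal Python sketch (not SAS syntax, and not how any particular procedure computes them internally) of the criteria listed above. The function names and the toy data are made up for illustration. It assumes the usual definitions: AIC = -2LogL + 2k and SC = -2LogL + k·ln(n), where k is the number of estimated parameters (including the intercept) and n is the number of observations.

```python
import math

def neg2_log_l(y, p):
    """-2LogL for 0/1 outcomes y and predicted event probabilities p."""
    return -2 * sum(math.log(pi) if yi == 1 else math.log(1 - pi)
                    for yi, pi in zip(y, p))

def aic(neg2ll, k):
    """AIC = -2LogL + 2k: a penalty of 2 per estimated parameter."""
    return neg2ll + 2 * k

def sc(neg2ll, k, n):
    """SC (also called BIC) = -2LogL + k*ln(n): penalty grows with sample size."""
    return neg2ll + k * math.log(n)

def roc_area(y, p):
    """Share of (event, non-event) pairs in which the event received the
    higher predicted probability; tied pairs count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                score += 1.0
            elif e == ne:
                score += 0.5
    return score / (len(events) * len(non_events))

def brier(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

# Toy data, invented for illustration only (not from any real model).
y = [1, 0, 1, 1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]

fit = neg2_log_l(y, p)
print("-2LogL:", round(fit, 3))
print("AIC   :", round(aic(fit, k=3), 3))        # k=3 is an assumed parameter count
print("SC    :", round(sc(fit, k=3, n=len(y)), 3))
print("ROC   :", roc_area(y, p))
print("Brier :", round(brier(y, p), 3))
```

The point of the penalty terms is that adding a variable always lowers -2LogL a little, so AIC and SC only reward the extra variable if the drop in -2LogL outweighs the penalty. SC penalizes harder than AIC once n is larger than about 7, so it tends to prefer smaller models.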

Super Contributor
Posts: 301

## Re: Best Fit Logistic Regression Model

I think stepwise selection has a better chance of giving the model with the best fit (compared to forward / backward). This is because it can go both forward and backward until no further step improves the fit. Backward selection only goes backward, and forward selection only goes forward.
Btw, there is also the LASSO method, which can be as good as stepwise selection.
Super User
Posts: 10,214