Programming the statistical procedures from SAS

Best Fit Logistic Regression Model

Posts: 22

Hello all!


I need to fit a logistic regression model and am wondering which model-selection method would be best. I have been advised to stay away from forward/backward/stepwise regression. All-possible-regressions seems attractive, but I must admit I'm a little lost on AIC/BIC/Cp/etc. and exactly how I would go about picking the best model...


I have a binary response variable, a categorical predictor, 10 categorical covariates, and 2 continuous covariates.


Thank you in advance!

Grand Advisor
Posts: 16,925

Re: Best Fit Logistic Regression Model

Search for "model selection method" on here. This topic comes up frequently, and there is no 'CORRECT' answer, but some answers are more valid than others ;)

Posts: 22

Re: Best Fit Logistic Regression Model

Unfortunately, I've been all over the boards and haven't found anything useful. I've also read several papers, but I just can't seem to locate the syntax for an all-possible-regressions run. In addition, I was hoping someone could break it down for me in less technical language so I could really understand AIC/Cp/etc...

Posts: 53

Re: Best Fit Logistic Regression Model

Hi @chelsealutz



After identifying the candidate variables for inclusion in the model using any of the following selection options:

- selection = stepwise slentry = 0.15 slstay = 0.15

- selection = forward slentry = 0.15

- selection = backward slstay = 0.15

- selection = score

(for quantitative and categorical variables as well as interaction terms), you can compare the resulting models on the following criteria:


  • -2LogL
    • The value itself is not important. It is used to compare two nested models: the difference in -2LogL between them is approximately chi-square distributed, with degrees of freedom equal to the difference in the number of parameters. Adding terms can only lower -2LogL, so judge the drop with that test rather than by its size alone.
  • AIC (Akaike Information Criterion)
    • AIC can be used to compare non-nested models fitted to the same sample. The AIC value itself is not meaningful, but the model with the smallest AIC is considered the best.
  • SC (Schwarz Criterion, also known as BIC)
    • The model with the smallest SC is most desirable, but the value itself is not meaningful. Like AIC, it is appropriate for non-nested models.
  • ROC Area
    • The area under the ROC curve is a measure of the model's ability to discriminate between event and non-event. Large values are desirable (predictive accuracy for (event, non-event) pairs). A common rule of thumb:
      • ROC = 0.5: no discrimination (no better than a coin toss)
      • 0.7 <= ROC < 0.8: acceptable discrimination
      • 0.8 <= ROC < 0.9: excellent discrimination
      • ROC >= 0.9: outstanding discrimination
  • Brier's Score
    • Small values are desirable.
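
For anyone who wants to see how these criteria are put together, here is a minimal sketch in plain Python (not SAS), assuming you already have each model's predicted event probabilities. The function name `fit_criteria` and the toy data are made up for illustration. The formulas used are AIC = -2LogL + 2k and SC = -2LogL + k*ln(n), where k is the number of estimated parameters and n the number of observations. The ROC area is computed as the share of (event, non-event) pairs in which the event gets the higher predicted probability, with ties counted as half.

```python
import math


def fit_criteria(y, p, k):
    """Return fit criteria for a binary-response model.

    y: observed outcomes (1 = event, 0 = non-event)
    p: predicted event probabilities, each strictly between 0 and 1
    k: number of estimated parameters, including the intercept
    """
    n = len(y)

    # Log-likelihood of the observed outcomes under the predicted probabilities.
    loglik = sum(math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p))
    neg2logl = -2.0 * loglik

    # Both criteria add a penalty for model size to -2LogL.
    # SC penalizes each parameter by ln(n), which exceeds AIC's 2 once n >= 8.
    aic = neg2logl + 2 * k
    sc = neg2logl + k * math.log(n)

    # Brier score: mean squared gap between predicted probability and outcome.
    brier = sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / n

    # ROC area: compare every (event, non-event) pair of predictions.
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = sum(
        1.0 if e > ne else 0.5 if e == ne else 0.0
        for e in events
        for ne in nonevents
    )
    roc_area = score / (len(events) * len(nonevents))

    return {"neg2logl": neg2logl, "aic": aic, "sc": sc,
            "brier": brier, "roc_area": roc_area}


# Made-up outcomes and predicted probabilities from two hypothetical models;
# these are illustrative numbers, not output from a real fit.
y = [1, 1, 1, 0, 0, 0, 1, 0]
p_small = [0.7, 0.6, 0.6, 0.4, 0.5, 0.3, 0.5, 0.4]  # assumed k = 2
p_large = [0.9, 0.7, 0.6, 0.3, 0.5, 0.2, 0.6, 0.2]  # assumed k = 4

for name, p, k in [("small", p_small, 2), ("large", p_large, 4)]:
    crit = fit_criteria(y, p, k)
    print(name, {key: round(value, 3) for key, value in crit.items()})
```

The point of the sketch is that AIC and SC are just -2LogL plus a charge per parameter, so a bigger model only wins if its improvement in fit outweighs that charge. The software's own ROC area may differ slightly from this pairwise count if it groups predicted probabilities before comparing them.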




Super Contributor
Posts: 271

Re: Best Fit Logistic Regression Model

I think stepwise selection has a better chance of giving the best-fitting model than forward or backward selection. This is because it can both add and remove variables until no further step improves the fit. Backward selection only removes variables, and forward selection only adds them.
Btw, there is also the LASSO method, which can be as good as stepwise selection.

Grand Advisor
Posts: 9,463

Re: Best Fit Logistic Regression Model

Keep in mind that forward/backward/stepwise selection all fit an unconditional logistic regression.

Once you have the most influential variables, to get the best fit you'd better try exact logistic regression, conditional logistic regression, or penalized logistic regression (add the FIRTH option to the MODEL statement).
