Calcite | Level 5

We are building a logistic model and are having trouble with very small predicted probabilities. We fit two different models to the same data (138,000 observations):

Model 1 has 3 variables and 3 interactions (one variable raised to the 2nd, 3rd, and 4th power). The intercept is -460, the Hosmer-Lemeshow p-value is .0001, and the predicted probabilities range from .034 to .80.

Model 2 has 16 variables (11 are a date used in the CLASS statement) and 3 interactions (one variable raised to the 2nd, 3rd, and 4th power). The intercept is -709, the Hosmer-Lemeshow p-value is .5865, and the predicted probabilities range from 4.12E-67 to 9.47E-17.

We ran a correlation between the predicted probabilities and the response. In Model 1 the correlation is positive, as you would expect; in Model 2 it is negative.

We have run many iterations of the two models, and this is the best we can get. We would like to use Model 2 but are concerned about the probabilities being so low. Why are the probabilities so low, why are they negatively correlated with the response, and what can we do to fix it?
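
For reference, the reported probability ranges can be translated to the logit (linear predictor) scale, which shows how far from zero the fitted log-odds must be to produce them. This is a minimal Python sketch, not SAS output; the helper names `logit` and `inv_logit` are illustrative assumptions, and the only inputs are the ranges quoted in the post.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p) - math.log1p(-p)

def inv_logit(eta):
    """Logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Probability ranges as reported in the post
reported = {
    "Model 1": (0.034, 0.80),
    "Model 2": (4.12e-67, 9.47e-17),
}

for name, (p_low, p_high) in reported.items():
    print(f"{name}: linear predictor from {logit(p_low):.1f} to {logit(p_high):.1f}")
```

For Model 2 the linear predictor works out to roughly -153 to -37 for every observation, so the covariate terms never come close to offsetting the large negative intercept.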

Obsidian | Level 7

Sounds like the second model is a total failure. The predicted probabilities are essentially all 0. There have been a few recent posts by Ruth on modeling rare events. Have a look at those to see if they give you some technical ideas.
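
One way to make "essentially all 0" concrete is to bound the number of events the model predicts. For a maximum likelihood logistic fit that includes an intercept, the fitted probabilities on the training data sum to the observed count of the modeled response level, so a large gap between the two is worth investigating (for example, convergence, or which response level is being modeled). The sketch below is in Python rather than SAS; the function names and the four-row toy data are hypothetical, and only the 138,000 observations and the 9.47E-17 maximum come from the question.

```python
def max_expected_events(n_obs, p_max):
    """Upper bound on the sum of predicted probabilities over n_obs rows."""
    return n_obs * p_max

def calibration_gap(y, p):
    """Observed event count minus the sum of predicted probabilities."""
    return sum(y) - sum(p)

# Figures quoted in the question for Model 2
n_obs = 138_000
bound = max_expected_events(n_obs, 9.47e-17)
print(f"Model 2 predicts at most {bound:.3g} events in {n_obs} observations")

# Hypothetical toy data, only to show how the gap would be computed
y_toy = [1, 0, 0, 1]
p_toy = [0.5, 0.5, 0.5, 0.5]
print("toy calibration gap:", calibration_gap(y_toy, p_toy))
```

If the response has any appreciable number of events, a bound this small means the scored probabilities cannot be the converged fitted probabilities for that event on the same data.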
