Linda_CBE
Calcite | Level 5

We are building a logistic model and are having trouble with the predicted probabilities being very small. We fit two different models to the same data (138,000 observations).

Model 1 has 3 variables and 3 interactions (one variable raised to the 2nd, 3rd and 4th power). The intercept is -460, the Hosmer-Lemeshow p-value is .0001, and the predicted probabilities range from .034 to .80.

Model 2 has 16 variables (11 are a date used in the CLASS statement) and 3 interactions (one variable raised to the 2nd, 3rd and 4th power). The intercept is -709, the Hosmer-Lemeshow p-value is .5865, and the predicted probabilities range from 4.12E-67 to 9.47E-17.

We also ran a correlation between the predicted probabilities and the response. In Model 1 the correlation is positive, as you would expect. In Model 2 it is negative.

We have done many iterations of the two models and this is the best we can get. We would like to use Model 2 but are concerned about the probabilities being so low. Why are the probabilities so low, why are they negatively correlated with the response, and what can we do to fix it?

1 REPLY
DLing
Obsidian | Level 7

Sounds like the second model is a total failure. The predicted probabilities are essentially all 0. There have been a few recent postings by Ruth on modeling rare events. Have a look at those to see if they give you some technical ideas.
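One way to check whether a set of predicted probabilities is plausible is sketched below in Python. This is only an illustration and not anything from the thread: the helper names (`logit`, `pearson`, `sanity_check`) and the toy data are made up. It relies on one general property: for a maximum-likelihood logistic fit that includes an intercept, the mean predicted probability on the training data equals the observed event rate. Predictions that are all near 0 on data that contains events would break that property, which points to a problem in fitting or scoring rather than to a genuinely rare outcome.

```python
import math


def logit(p):
    """Log-odds for a probability p with 0 < p < 1."""
    return math.log(p) - math.log1p(-p)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def sanity_check(probs, y, rel_tol=0.05):
    """Compare mean predicted probability with the observed event rate.

    rel_tol is an arbitrary illustrative tolerance, not a standard cutoff.
    """
    event_rate = sum(y) / len(y)
    mean_pred = sum(probs) / len(probs)
    return {
        "event_rate": event_rate,
        "mean_predicted": mean_pred,
        "correlation": pearson(probs, y),
        "mean_matches_event_rate": math.isclose(mean_pred, event_rate, rel_tol=rel_tol),
    }


# Hypothetical toy data: 8 observations, 4 events.
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Predictions that behave like a sensible fit.
plausible = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

# Predictions in the range reported for Model 2, lower for the events.
suspicious = [9e-17, 8e-17, 7e-17, 2e-17, 6e-17, 1e-30, 1e-40, 4.12e-67]

report_plausible = sanity_check(plausible, y)
report_suspicious = sanity_check(suspicious, y)

# Linear-predictor (log-odds) values implied by the reported Model 2 range.
logit_low = logit(4.12e-67)
logit_high = logit(9.47e-17)

print(report_plausible)
print(report_suspicious)
print(logit_low, logit_high)
```

If the mean predicted probability is nowhere near the event rate, it is worth checking the log for convergence or separation warnings and confirming that the scored data match the fitted data. A negative correlation can also mean the model is targeting the other response level: PROC LOGISTIC models the lower-ordered response value unless the EVENT= or DESCENDING option says otherwise.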
