Logistic Regression very low probabilities

07-12-2011 10:51 AM

We are building a logistic model and are having issues with the predicted probabilities being very small. We fit two different models to the same data (138,000 obs). Model 1 has 3 variables and 3 interactions (one variable to the 2nd, 3rd and 4th power); the intercept is -460, the Hosmer-Lemeshow p-value is .0001, and the probabilities range from .034 to .80. Model 2 has 16 variables (11 are a date used in the CLASS statement) and 3 interactions (one variable to the 2nd, 3rd and 4th power); the intercept is -709, the Hosmer-Lemeshow p-value is .5865, and the probabilities range from 4.12E-67 to 9.47E-17. We ran a correlation between the probabilities and the response: in Model 1 it is positive, as you would expect, but in Model 2 it is negative.

We have done many iterations of the two models and this is the best we can get. We would like to use Model 2 but are concerned with the probabilities being so low. Why are the probabilities so low, why are they negatively correlated with the response and what can we do to fix it?

07-12-2011 02:03 PM

Sounds like the second model is a total failure. The predicted probabilities are essentially all 0. There have been a few postings recently on modeling rare events by Ruth. Have a look at those to see if they give you some technical ideas.
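
To put "essentially all 0" in numbers, here is a minimal sketch in Python (not SAS, and not taken from the original analysis). It converts the probability ranges reported in the question back to the log-odds scale and computes an upper bound on the number of events Model 2 would predict across all 138,000 observations. The helper names `logit` and `inv_logit` are illustrative. The sketch assumes the reported probabilities are the model's fitted values on the same data; under that assumption, an unweighted maximum-likelihood logistic fit with an intercept should have fitted probabilities that sum to the observed number of events, so a total near zero would suggest a scoring or event-level problem rather than a genuinely rare outcome.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p) - math.log1p(-p)


def inv_logit(eta):
    """Inverse logit, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Probability ranges as reported in the question
ranges = {
    "Model 1": (0.034, 0.80),
    "Model 2": (4.12e-67, 9.47e-17),
}

for name, (p_min, p_max) in ranges.items():
    print(f"{name}: linear predictor from {logit(p_min):.1f} to {logit(p_max):.1f}")

# Upper bound on the expected event count implied by Model 2:
# every observation gets at most the largest reported probability.
n_obs = 138_000
max_expected_events = n_obs * ranges["Model 2"][1]
print(f"Model 2 implies at most {max_expected_events:.3g} expected events")
```

If the data contain any events at all, the bound printed for Model 2 cannot match the observed count, which would be a reason to check how the probabilities were scored and which response level is being modeled.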