Predictive Model Results


01-17-2017 05:58 AM

Hi,

I used several predictive models to score the probability that an observation (a customer) will stop paying their bill.

I used logistic regression, a random forest, and an SVM.

The best model for my population was the forest, with AUC = 0.81 and a misclassification rate of 0.058.

Do these results reflect good predictive ability of the model?

Thanks,

Moshe


01-23-2017 02:08 PM

Depends on what the prediction would be without a model. If the proportion of observations with the most common target value in the data is near 1 - 0.058, then a misclassification rate of 0.058 is not good. On the other hand, if the proportion is around 1/2, then 0.058 is a great number.
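
This comparison can be sketched in a few lines (the data below is invented for illustration, not Moshe's actual population): compare the model's misclassification rate to the naive baseline of always predicting the majority class.

```python
# Invented example: a population where 5.8% of observations are events.
labels = [0] * 942 + [1] * 58

# Naive baseline: always predict the majority class (0 here).
majority_share = max(labels.count(0), labels.count(1)) / len(labels)
baseline_error = round(1 - majority_share, 3)   # = 0.058

# If the model's misclassification rate is also 0.058, it is no better
# than this trivial baseline; against a 50/50 base rate, 0.058 would be
# excellent.
print(baseline_error)
```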

I suspect AUC of 0.81 is good, because it is much larger than 0.5.
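
For context on why 0.5 is the reference point: AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so random scoring gives 0.5. A minimal from-scratch illustration (labels and scores are toy values, not from the thread):

```python
def auc(labels, scores):
    """AUC = P(random positive scores above random negative); ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: most positives out-rank most negatives (8 of 9 pairs ordered
# correctly), so the AUC lands well above the 0.5 random-guessing level.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(auc(labels, scores))
```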


01-24-2017 06:16 AM

Adding to Padraic's great comments: rather than focusing on one number, you may also use the cumulative captured response values at different percentile thresholds to decide whether the model is good enough.

Let's say you have a budget to take action on the top 5 percent of your population (send a reminder SMS, a call from the contact center, etc.). What would be the response rate of your model at the 5th percentile versus the overall event rate (random selection)? There are cases where a model with a lower AUC than the champion model performs better at the extreme percentiles. You may also compute the total loss (unpaid invoices) in the top buckets to justify the value of your model before deploying it in production.
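
The percentile comparison above can be sketched as follows (all numbers invented; this is a generic illustration, not SAS output): rank observations by model score, take the top fraction, and compare the event rate there to the overall event rate.

```python
def response_rate_at_top(labels, scores, frac):
    """Event rate among the top `frac` of observations ranked by score."""
    ranked = [y for _, y in sorted(zip(scores, labels), reverse=True)]
    k = max(1, int(len(ranked) * frac))
    return sum(ranked[:k]) / k

# Invented data: 20 customers, 3 of whom stopped paying (15% event rate),
# with scores already listed from highest to lowest.
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [(20 - i) / 20 for i in range(20)]

overall = sum(labels) / len(labels)                 # 0.15 event rate
top5 = response_rate_at_top(labels, scores, 0.05)   # top 1 of 20 customers
print(top5 / overall)                               # lift at the 5th percentile
```

A model can show a large lift like this at the top of the ranking even when its overall AUC is unremarkable, which is why the budget-constrained view can favor a different model than the AUC comparison does.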

Tuba.