Hi,
I used several predictive models to score the probability that an observation will stop paying their bill.
I tried logistic regression, a forest, and an SVM.
The best model for my population was the forest, with AUC = 0.81 and misclassification rate = 0.058.
Do these results reflect good predictive ability of the model?
Thanks,
Moshe
It depends on what the prediction would be without a model. If the proportion of observations with the most common target value in the data is near 1 - 0.058, then a misclassification rate of 0.058 is not good, because always predicting the most common value would do about as well. On the other hand, if the proportion is around 1/2, then 0.058 is a great number.
I suspect AUC of 0.81 is good, because it is much larger than 0.5.
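To make the baseline comparison concrete, here is a minimal Python sketch. The misclassification rate of 0.058 is from the original post; the event rates in the loop are hypothetical, since the actual proportion of non-payers was not given, and the function names are made up for illustration.

```python
def majority_baseline_error(event_rate: float) -> float:
    """Error rate of always predicting the most common target value."""
    return min(event_rate, 1.0 - event_rate)


def relative_error_reduction(model_error: float, event_rate: float) -> float:
    """Share of the baseline error that the model removes (1.0 = all of it)."""
    baseline = majority_baseline_error(event_rate)
    return (baseline - model_error) / baseline


model_error = 0.058  # misclassification rate reported in the question

# Hypothetical event rates: the real one is not stated in the thread.
for event_rate in (0.06, 0.20, 0.50):
    baseline = majority_baseline_error(event_rate)
    gain = relative_error_reduction(model_error, event_rate)
    print(f"event rate {event_rate:.2f}: baseline error {baseline:.3f}, "
          f"relative reduction {gain:.1%}")
```

With an event rate close to 0.058 the model barely beats the no-model guess, while at an event rate of 0.5 it removes most of the baseline error.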
Adding to Padraic's great comments: rather than focusing on one number, you can also use the cumulative captured response at different percentile thresholds to decide whether the model is good enough.
Let's say you have a budget to take action on the top 5 percent of your population (send a reminder SMS, a call from the contact center, etc.). What would the response rate of your model be in the top 5 percent versus the overall event rate (random selection)? There may be cases where a model with a lower AUC than the champion model performs better at the extreme percentiles. You may also compute the total loss (unpaid invoices) in the top buckets to justify the value of your model before deploying it in production.
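As a rough illustration of the idea, the Python sketch below computes the response rate, lift, and cumulative captured response for the top-scored fraction of a population. The function name, its arguments, and the toy data are all hypothetical; in practice the scores and actual outcomes would come from a holdout sample.

```python
import math


def top_fraction_stats(scores, events, fraction=0.05):
    """Compare the top-scored fraction of observations with the whole population.

    scores: predicted probabilities of the event (higher = riskier)
    events: actual outcomes, 1 = stopped paying, 0 = kept paying
    """
    n = len(scores)
    total_events = sum(events)
    if n == 0 or total_events == 0:
        raise ValueError("need at least one observation and one event")

    k = max(1, math.ceil(n * fraction))  # number of observations targeted
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_events = sum(events[i] for i in ranked[:k])

    response_rate = top_events / k   # event rate inside the targeted group
    overall_rate = total_events / n  # what random selection would give
    return {
        "response_rate": response_rate,
        "lift": response_rate / overall_rate,
        "captured_response": top_events / total_events,
    }


# Toy data (made up): 20 observations, 4 of which are events.
toy_scores = [i / 20 for i in range(20)]
toy_events = [0] * 20
for idx in (5, 16, 18, 19):
    toy_events[idx] = 1

print(top_fraction_stats(toy_scores, toy_events, fraction=0.25))
```

Running the same function for two candidate models at the fraction your budget allows shows which one finds more non-payers where it matters, regardless of which has the higher overall AUC.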
Tuba.