Dear all modelling experts,
I have built an attrition model using logistic regression, with a two-month operational gap, to estimate the probability that a customer will churn. During model development I'm getting very good model accuracy, with ROC (.94) and Gini (.75). However, when I look at the predicted probability for the event category (P_1), the scores look a bit low even though the events are predicted accurately. Could anyone suggest how I could adjust the probability scores? For example, if I target the top 3 deciles of the population, I have to consider a customer who has a 30% chance of attrition; how could I reweight that customer to 90%? I have tried reweighting by AIC or -2 Log L, but the scores stay the same. It would be really great to have expert suggestions.
Please find below the distribution of average scores from the model:
Decile | N Obs | N | Mean | Std Dev | Minimum | Maximum |
0 | 13828 | 13828 | 0.000458541 | 0.000176654 | 0.000253029 | 0.000682157 |
1 | 9468 | 9468 | 0.0011240 | 0.000118160 | 0.000732903 | 0.0011879 |
2 | 11772 | 11772 | 0.0014577 | 0.000287458 | 0.0012162 | 0.0020466 |
3 | 12338 | 12338 | 0.0022044 | 0.000060845 | 0.0021169 | 0.0022530 |
4 | 11304 | 11304 | 0.0037133 | 0.000852377 | 0.0022655 | 0.0053192 |
5 | 11802 | 11802 | 0.0087238 | 0.0021891 | 0.0054045 | 0.0131404 |
6 | 11682 | 11682 | 0.0150799 | 0.0024730 | 0.0131558 | 0.0232916 |
7 | 11843 | 11843 | 0.0411295 | 0.0108332 | 0.0236994 | 0.0622775 |
8 | 10159 | 10159 | 0.1500261 | 0.0759429 | 0.0623467 | 0.2864124 |
9 | 13409 | 13409 | 0.4677841 | 0.1755924 | 0.2919547 | 0.9407925 |
I have also provided the model output for your convenience:
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Criterion | Intercept Only | Intercept and Covariates | |
AIC | 188448 | 90327.87 | |
SC | 188458.33 | 90462.22 | |
-2 Log L | 188446.00 | 90301.87 | |
Testing Global Null Hypothesis: BETA=0
Test | Chi-Square | DF | Pr > ChiSq |
Likelihood Ratio | 98144.1267 | 12 | <.0001 |
Score | 89795.0913 | 12 | <.0001 |
Wald | 32295.415 | 12 | <.0001 |
Percent Concordant | 94.3 | Somers' D | 0.893 |
Percent Discordant | 5 | Gamma | 0.9 |
Percent Tied | 0.7 | Tau-a | 0.222 |
Pairs | 6419968407 | c | 0.947 |
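As a quick cross-check of the association table above, here is a minimal Python sketch that rebuilds Somers' D, Gamma and c from the concordant, discordant and tied percentages. The variable names are my own, and it uses the usual definitions (Somers' D = concordant minus discordant, c = concordant plus half the ties). Gini is commonly taken as 2c - 1, which on these numbers equals Somers' D.

```python
# Percentages copied from the association table in the post.
pct_concordant = 94.3
pct_discordant = 5.0
pct_tied = 0.7

# Convert to proportions of all event/non-event pairs.
nc = pct_concordant / 100
nd = pct_discordant / 100
nt = pct_tied / 100

somers_d = nc - nd                  # (concordant - discordant) / pairs
gamma = (nc - nd) / (nc + nd)       # ties are excluded from the denominator
c_stat = nc + 0.5 * nt              # ties count as half-concordant
gini_from_c = 2 * c_stat - 1        # common Gini definition (assumption)

print(f"Somers' D = {somers_d:.3f}")
print(f"Gamma     = {gamma:.3f}")
print(f"c         = {c_stat:.4f}")
print(f"2c - 1    = {gini_from_c:.3f}")
```

Because the percentages are rounded to one decimal, c comes out slightly below the 0.947 printed by SAS.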
Partition for the Hosmer and Lemeshow Test
Group | Total | response = 1 Observed | response = 1 Expected | response = 0 Observed | response = 0 Expected |
1 | 25145 | 44 | 9.51 | 25101 | 25135.49 |
2 | 18536 | 30 | 16.71 | 18506 | 18519.29 |
3 | 19557 | 53 | 25.88 | 19504 | 19531.12 |
4 | 22796 | 66 | 44.57 | 22730 | 22751.43 |
5 | 22089 | 84 | 102.85 | 22005 | 21986.15 |
6 | 22674 | 340 | 251.13 | 22334 | 22422.87 |
7 | 22745 | 802 | 709.69 | 21943 | 22035.31 |
8 | 22445 | 2860 | 3013.02 | 19585 | 19431.98 |
9 | 20979 | 6777 | 7559.15 | 14202 | 13419.85 |
10 | 30466 | 21967 | 21290.48 | 8499 | 9175.52 |
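To make the partition table easier to read as a calibration check, here is a rough Python sketch that turns each group's observed and expected event counts into rates. The numbers are copied from the table; the names (`partition`, `total_obs`, `total_exp`) are only for illustration.

```python
# (group, total, observed events, expected events) copied from the partition table.
partition = [
    (1, 25145, 44, 9.51),
    (2, 18536, 30, 16.71),
    (3, 19557, 53, 25.88),
    (4, 22796, 66, 44.57),
    (5, 22089, 84, 102.85),
    (6, 22674, 340, 251.13),
    (7, 22745, 802, 709.69),
    (8, 22445, 2860, 3013.02),
    (9, 20979, 6777, 7559.15),
    (10, 30466, 21967, 21290.48),
]

# Observed vs. expected event rate per group, plus their ratio.
for group, total, obs, exp in partition:
    obs_rate = obs / total
    exp_rate = exp / total
    print(f"group {group:>2}: observed={obs_rate:.4f} "
          f"expected={exp_rate:.4f} obs/exp={obs / exp:.2f}")

# Overall totals: with an intercept in the model these should nearly match.
total_obs = sum(obs for _, _, obs, _ in partition)
total_exp = sum(exp for _, _, _, exp in partition)
print(f"total observed events = {total_obs}, total expected = {total_exp:.2f}")
```

The overall totals agree closely, so any mismatch is in how the predicted probability is spread across groups rather than in its overall level.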
Hosmer and Lemeshow Goodness-of-Fit Test
Chi-Square | DF | Pr > ChiSq |
428.9279 | 8 | <.0001
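For anyone who wants to see where the chi-square above comes from, here is a small Python sketch that recomputes it from the partition table, assuming the standard Hosmer and Lemeshow form (the sum of (observed - expected)^2 / expected over both response columns, with groups - 2 degrees of freedom). Since the expected counts in the post are rounded to two decimals, the result only approximately matches the printed value.

```python
# (observed 1, expected 1, observed 0, expected 0) per group, from the partition table.
rows = [
    (44, 9.51, 25101, 25135.49),
    (30, 16.71, 18506, 18519.29),
    (53, 25.88, 19504, 19531.12),
    (66, 44.57, 22730, 22751.43),
    (84, 102.85, 22005, 21986.15),
    (340, 251.13, 22334, 22422.87),
    (802, 709.69, 21943, 22035.31),
    (2860, 3013.02, 19585, 19431.98),
    (6777, 7559.15, 14202, 13419.85),
    (21967, 21290.48, 8499, 9175.52),
]

# Pearson-type contribution from the event and non-event cells of each group.
chi_square = sum(
    (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
    for o1, e1, o0, e0 in rows
)
df = len(rows) - 2  # number of groups minus 2

print(f"Hosmer-Lemeshow chi-square ~ {chi_square:.2f} on {df} DF")
```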
Dear all modelling experts,
I have built an attrition model using logistic regression, with a two-month operational gap, to estimate the probability that a customer will churn. During model development I'm getting very good model accuracy, with ROC (.94) and Gini (.75). However, when I look at the predicted probability for the event category (P_1), the scores look a bit low even though the events are predicted accurately. Could anyone suggest how I could adjust the probability scores? For example, if I target the top 3 deciles of the population, I have to consider a customer in decile 1 (the top 10% of the population) who has a 30% chance of attrition, and in the real world that customer does churn; in that case, how could I reweight that customer to 90% in decile 1? I have tried reweighting by AIC or -2 Log L, but the scores stay the same. It would be really great to have expert suggestions.
Please find below the distribution of average scores from the model:
Row Labels | Average of SCORE_1 | Actually closed | # RISK Customer | Grand |
0 | 0.858566898 | 485 | 16531 | 0.029338818 |
1 | 0.602700652 | 345 | 16322 | 0.021137116 |
2 | 0.459997375 | 321 | 16343 | 0.019641437 |
3 | 0.25116008 | 401 | 16661 | 0.024068183 |
4 | 0.13434664 | 523 | 16014 | 0.032658923 |
5 | 0.070197285 | 322 | 18175 | 0.017716644 |
6 | 0.039646414 | 429 | 17278 | 0.024829263 |
7 | 0.03082105 | 356 | 13737 | 0.025915411 |
8 | 0.019729811 | 307 | 17477 | 0.017565944 |
9 | 0.009666523 | 278 | 15341 | 0.018121374 |
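In case it helps when reading the table above, the last column appears to be "Actually closed" divided by "# RISK Customer", i.e. the observed closure rate per decile. Here is a minimal Python sketch (the names are mine, not from the original workbook) that recomputes that rate and puts it next to the average score.

```python
# (decile, average SCORE_1, actually closed, # risk customers, last column) from the table.
deciles = [
    (0, 0.858566898, 485, 16531, 0.029338818),
    (1, 0.602700652, 345, 16322, 0.021137116),
    (2, 0.459997375, 321, 16343, 0.019641437),
    (3, 0.25116008, 401, 16661, 0.024068183),
    (4, 0.13434664, 523, 16014, 0.032658923),
    (5, 0.070197285, 322, 18175, 0.017716644),
    (6, 0.039646414, 429, 17278, 0.024829263),
    (7, 0.03082105, 356, 13737, 0.025915411),
    (8, 0.019729811, 307, 17477, 0.017565944),
    (9, 0.009666523, 278, 15341, 0.018121374),
]

# Recompute the observed closure rate and compare it with the reported column.
observed_rates = []
for decile, avg_score, closed, at_risk, reported in deciles:
    rate = closed / at_risk
    observed_rates.append(rate)
    print(f"decile {decile}: avg score={avg_score:.4f} "
          f"observed rate={rate:.4f} reported={reported:.4f}")

# Largest gap between the recomputed and the reported rate.
max_gap = max(abs(r - row[4]) for r, row in zip(observed_rates, deciles))
print(f"max difference from reported column = {max_gap:.2e}")
```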
Regards,
Mou