Time dependent Propensity scores issue

salammunshi · Posted 06-14-2019 10:44 PM

Dear all modelling experts,

I have built an attrition model using logistic regression, with a two-month operational gap, to estimate the probability that a customer will churn. During model development I am getting very good model accuracy, with ROC (.94) and Gini (.75). However, when I look at the predicted probability of the event category (P_1), the scores look rather low even though the events did actually occur. Could anyone suggest how I could adjust the probability scores? For example, if I target the top 3 deciles of the population, I have to consider a customer whose predicted chance of attrition is 30%; how could I reweight that customer to 90%? I have tried reweighting by AIC or -2 Log L, but the scores stay the same. Expert suggestions would be greatly appreciated.

Please find below the distribution of average scores from the model:

Analysis Variable : P_1 Predicted Probability: response=1
Rank for Variable P_1	N Obs	N	Mean	Std Dev	Minimum	Maximum
0	13828	13828	0.000458541	0.000176654	0.000253029	0.000682157
1	9468	9468	0.0011240	0.000118160	0.000732903	0.0011879
2	11772	11772	0.0014577	0.000287458	0.0012162	0.0020466
3	12338	12338	0.0022044	0.000060845	0.0021169	0.0022530
4	11304	11304	0.0037133	0.000852377	0.0022655	0.0053192
5	11802	11802	0.0087238	0.0021891	0.0054045	0.0131404
6	11682	11682	0.0150799	0.0024730	0.0131558	0.0232916
7	11843	11843	0.0411295	0.0108332	0.0236994	0.0622775
8	10159	10159	0.1500261	0.0759429	0.0623467	0.2864124
9	13409	13409	0.4677841	0.1755924	0.2919547	0.9407925
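
For readers who want to reproduce this kind of table outside SAS, here is a minimal Python sketch of how scores can be split into ten rank groups and summarized. It assumes equal-sized groups cut from the sorted scores, which only approximates how a SAS ranking step treats ties. The function name `decile_summary` and the toy scores are made up for illustration and are not taken from the model above.

```python
from statistics import mean, stdev

def decile_summary(scores, groups=10):
    """Split scores into `groups` rank groups (0 = lowest) and summarize each."""
    ordered = sorted(scores)
    n = len(ordered)
    buckets = [[] for _ in range(groups)]
    for i, score in enumerate(ordered):
        # Position i in the sorted list maps to a group index 0..groups-1.
        buckets[i * groups // n].append(score)
    return [
        {
            "rank": g,
            "n": len(b),
            "mean": mean(b),
            "std": stdev(b) if len(b) > 1 else 0.0,
            "min": b[0],
            "max": b[-1],
        }
        for g, b in enumerate(buckets)
        if b
    ]

# Hypothetical toy scores, only to show the shape of the output.
toy_scores = [i / 100 for i in range(100)]
summary = decile_summary(toy_scores)
for row in summary:
    print(row)
```

As in the table above, rank 0 holds the lowest scores and rank 9 the highest.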

I have also provided the model output for your convenience:

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Criterion	Intercept Only	Intercept and Covariates
AIC	188448	90327.87
SC	188458.33	90462.22
-2 Log L	188446.00	90301.87
Testing Global Null Hypothesis: BETA=0
Test	Chi-Square	DF	Pr > ChiSq
Likelihood Ratio	98144.1267	12	<.0001
Score	89795.0913	12	<.0001
Wald	32295.415	12	<.0001

Percent Concordant	94.3	Somers' D	0.893
Percent Discordant	5	Gamma	0.9
Percent Tied	0.7	Tau-a	0.222
Pairs	6419968407	c	0.947
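
The association statistics above can be cross-checked with a little arithmetic. The sketch below is in Python and uses the standard textbook definitions rather than anything confirmed from the SAS run itself. It takes the event and non-event counts as the column totals of the Hosmer and Lemeshow partition table in this post (33023 and 194409). Because the reported percentages are rounded, the results only match to about three decimals.

```python
# Reported percentages from the association table (rounded in the output).
pct_concordant = 94.3
pct_discordant = 5.0
pct_tied = 0.7

# Totals of the observed response = 1 and response = 0 columns
# in the Hosmer and Lemeshow partition table of this post.
events = 33023
nonevents = 194409

pairs = events * nonevents            # every event paired with every non-event
p_c = pct_concordant / 100
p_d = pct_discordant / 100
p_t = pct_tied / 100

c_stat = p_c + 0.5 * p_t              # area under the ROC curve
somers_d = p_c - p_d                  # equals 2 * c - 1 up to rounding
gamma = (p_c - p_d) / (p_c + p_d)     # ignores tied pairs
n = events + nonevents
tau_a = (p_c - p_d) * pairs / (n * (n - 1) / 2)

print(pairs, round(c_stat, 3), round(somers_d, 3), round(gamma, 3), round(tau_a, 3))
```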

Partition for the Hosmer and Lemeshow Test
Group	Total	response = 1 Observed	response = 1 Expected	response = 0 Observed	response = 0 Expected
1	25145	44	9.51	25101	25135.49
2	18536	30	16.71	18506	18519.29
3	19557	53	25.88	19504	19531.12
4	22796	66	44.57	22730	22751.43
5	22089	84	102.85	22005	21986.15
6	22674	340	251.13	22334	22422.87
7	22745	802	709.69	21943	22035.31
8	22445	2860	3013.02	19585	19431.98
9	20979	6777	7559.15	14202	13419.85
10	30466	21967	21290.48	8499	9175.52
Hosmer and Lemeshow Goodness-of-Fit Test
Chi-Square	DF	Pr > ChiSq
428.9279	8	<.0001
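
The chi-square above can be approximately rebuilt from the partition table. This is a minimal Python sketch under the usual definition of the Hosmer and Lemeshow statistic: the sum over groups of (observed - expected)^2 / expected, for both response columns. The expected counts are copied from the table, where they are rounded to two decimals, so the result should land close to the reported value rather than match it exactly.

```python
# (observed events, expected events, observed non-events, expected non-events)
# copied row by row from the partition table above.
partition = [
    (44, 9.51, 25101, 25135.49),
    (30, 16.71, 18506, 18519.29),
    (53, 25.88, 19504, 19531.12),
    (66, 44.57, 22730, 22751.43),
    (84, 102.85, 22005, 21986.15),
    (340, 251.13, 22334, 22422.87),
    (802, 709.69, 21943, 22035.31),
    (2860, 3013.02, 19585, 19431.98),
    (6777, 7559.15, 14202, 13419.85),
    (21967, 21290.48, 8499, 9175.52),
]

hl_chi_square = sum(
    (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
    for o1, e1, o0, e0 in partition
)
df = len(partition) - 2   # number of groups minus two

print(round(hl_chi_square, 2), df)
```

Printing the per-group terms instead of their sum shows which groups contribute most to the statistic.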

salammunshi · Posted 06-14-2019 11:42 PM

Dear all modelling experts,

I have built an attrition model using logistic regression, with a two-month operational gap, to estimate the probability that a customer will churn. During model development I am getting very good model accuracy, with ROC (.94) and Gini (.75). However, when I look at the predicted probability of the event category (P_1), the scores look rather low even though the events did actually occur. Could anyone suggest how I could adjust the probability scores? For example, if I target the top 3 deciles of the population, I have to consider a customer in decile 1 (the top 10% of the population) whose predicted chance of attrition is 30%, and in the real world that customer does churn. In that case, how could I reweight that customer to 90% in decile 1? I have tried reweighting by AIC or -2 Log L, but the scores stay the same. Expert suggestions would be greatly appreciated.

Please find below the distribution of average scores from the model:

Row Labels	Average of SCORE_1	Actually closed	# RISK Customer	Grand
0	0.858566898	485	16531	0.029338818
1	0.602700652	345	16322	0.021137116
2	0.459997375	321	16343	0.019641437
3	0.25116008	401	16661	0.024068183
4	0.13434664	523	16014	0.032658923
5	0.070197285	322	18175	0.017716644
6	0.039646414	429	17278	0.024829263
7	0.03082105	356	13737	0.025915411
8	0.019729811	307	17477	0.017565944
9	0.009666523	278	15341	0.018121374
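
The last column appears to be "Actually closed" divided by "# RISK Customer" (485 / 16531 is about 0.0293 for the first row). The Python sketch below, offered only as a cross-check, recomputes that rate for each row and puts it next to the average score and the number of closures that the average score would imply (average score times customers). The variable names are made up for illustration. Note that row label 0 holds the highest average score in this table.

```python
# (row label, average of SCORE_1, actually closed, # RISK customers)
# copied from the table above.
rows = [
    (0, 0.858566898, 485, 16531),
    (1, 0.602700652, 345, 16322),
    (2, 0.459997375, 321, 16343),
    (3, 0.25116008, 401, 16661),
    (4, 0.13434664, 523, 16014),
    (5, 0.070197285, 322, 18175),
    (6, 0.039646414, 429, 17278),
    (7, 0.03082105, 356, 13737),
    (8, 0.019729811, 307, 17477),
    (9, 0.009666523, 278, 15341),
]

observed_rates = []
for label, avg_score, closed, customers in rows:
    observed = closed / customers          # observed closure rate in the group
    implied = avg_score * customers        # closures implied by the average score
    observed_rates.append(observed)
    print(f"{label}: avg score {avg_score:.4f}, observed rate {observed:.4f}, "
          f"closed {closed}, implied by score {implied:.0f}")

total_closed = sum(r[2] for r in rows)
total_customers = sum(r[3] for r in rows)
overall_rate = total_closed / total_customers
print(f"overall observed rate {overall_rate:.4f}")
```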

Regards,

Mou
