Fluorite | Level 6

## Help, Attrition Model Performance in SAS. Thanks

Hi,

I have built an attrition model and I am evaluating its performance. I sorted the predicted probabilities from high to low and divided the customers into ten equally sized groups ("deciles"), so that each decile contains ten percent of the customer base, and then looked at model performance in terms of attrition rate by decile, using the code below:

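    /* Rank customers into 10 groups by predicted probability P (highest first) */
    proc rank data=OUT groups=10 out=OUT_DECILE descending;
       var P;
       ranks decile;
    run;

    /* PROC RANK numbers groups 0-9; shift to 1-10 */
    data OUT_DECILE;
       set OUT_DECILE;
       decile = decile + 1;
    run;

    /* Count, lapse rate, and number of lapses per decile */
    proc means data=OUT_DECILE n mean sum;
       class decile;
       var LAPSE;
    run;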

I first ran it on the training dataset used to build the model and got the first table below. Then I scored the validation dataset, ranked the probabilities again, and got the second table below. Shouldn't I get roughly the same percentage per decile? Does this mean my model is not performing well? I used the gain chart to compare validation and training, and they looked fine. Your help would be much appreciated. Many thanks.

Analysis Variable: LAPSE (Training Sample)

| Rank for Variable pred | N Obs | Decile Mean | Overall Mean |
|---|---|---|---|
| 1 | 20,986 | 79% | 30% |
| 2 | 21,014 | 70% | 30% |
| 3 | 20,999 | 38% | 30% |
| 4 | 21,041 | 29% | 30% |
| 5 | 17,839 | 25% | 30% |
| 6 | 24,168 | 22% | 30% |
| 7 | 20,952 | 20% | 30% |
| 8 | 21,013 | 12% | 30% |
| 9 | 20,998 | 7% | 30% |
| 10 | 20,990 | 5% | 30% |

Analysis Variable: LAPSE (Validation Sample)

| Rank for Variable pred | N Obs | Decile Mean | Overall Mean |
|---|---|---|---|
| 1 | 9,034 | 100% | 21% |
| 2 | 9,034 | 100% | 21% |
| 3 | 9,034 | 13% | 21% |
| 4 | 8,968 | 0% | 21% |
| 5 | 10,822 | 0% | 21% |
| 6 | 7,312 | 0% | 21% |
| 7 | 9,033 | 0% | 21% |
| 8 | 9,014 | 0% | 21% |
| 9 | 9,055 | 0% | 21% |
| 10 | 9,034 | 0% | 21% |

I have attached the gain chart. Many thanks
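The numbers behind a gain chart can also be tabulated directly from the deciled dataset, which makes a training versus validation comparison easier to read than a plot. The following is a minimal sketch, assuming LAPSE is coded 0/1 (1 = lapsed); the names DECILE_SUM, GAINS, n_obs, n_lapse, cum_lapse, cum_gain and the macro variable total_lapse are illustrative and do not come from the original code.

```sas
/* Per-decile counts: customers and lapses (assumes LAPSE is 0/1) */
proc means data=OUT_DECILE noprint nway;
   class decile;
   var LAPSE;
   output out=DECILE_SUM n=n_obs sum=n_lapse;
run;

/* Total number of lapses across all deciles */
proc sql noprint;
   select sum(n_lapse) into :total_lapse from DECILE_SUM;
quit;

/* Cumulative share of all lapses captured up to each decile */
data GAINS;
   set DECILE_SUM;              /* already ordered by decile (1 = highest P) */
   cum_lapse + n_lapse;         /* sum statement: running total, retained */
   cum_gain = cum_lapse / &total_lapse;
   keep decile n_obs n_lapse cum_lapse cum_gain;
run;

proc print data=GAINS noobs;
   format cum_gain percent8.1;
run;
```

Running the same steps on the deciled validation data and putting the two cum_gain columns side by side should show whether the curves really agree decile by decile.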

Obsidian | Level 7

## Re: Help, Attrition Model Performance in SAS. Thanks

Knowing nothing else, it seems to me that your training model is not generalizing well to the validation set, which is usually a sign of overfitting.
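One quick check, sketched below under some assumptions, is to put the mean predicted probability next to the observed lapse rate in each decile for both samples. If the two track each other on training but diverge on validation, that supports the generalization concern. VALID_DECILE is a hypothetical name for the validation data after the same PROC RANK step; P and LAPSE are taken from the original code.

```sas
/* Training: observed lapse rate vs. mean predicted probability by decile */
title "Training sample";
proc means data=OUT_DECILE n mean min max;
   class decile;
   var LAPSE P;
run;

/* Validation: same summary (VALID_DECILE is an assumed dataset name) */
title "Validation sample";
proc means data=VALID_DECILE n mean min max;
   class decile;
   var LAPSE P;
run;
title;
```

The min and max of P per decile also show whether the score ranges of the deciles are similar in the two samples.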

What tool are you using to create the initial model, and what technique?

Fluorite | Level 6

## Re: Help, Attrition Model Performance in SAS. Thanks

Hi,

I am using SAS and logistic regression. But the gain chart shows that the model is robust. Please see the attached chart.

Many Thanks

Alice

Fluorite | Level 6

## Re: Help, Attrition Model Performance in SAS. Thanks

Hi,

Did you use the training decile definition for the validation data? For example, if in the training data the first decile had minimum and maximum probabilities of, say, 0.9 - 0.95, then you should use the same decile definition for the validation data.

If you have, great; otherwise, use the training data decile definitions to compare training with validation.

Best regards,
Amit
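A minimal sketch of this suggestion, under stated assumptions: take the minimum P in each training decile as that decile's lower bound, then assign validation customers to deciles using those bounds instead of re-ranking. VALID_SCORED is a hypothetical name for the scored validation dataset, and TRAIN_CUTS, CUTS_WIDE, VALID_DECILE, p_min and cut1-cut10 are illustrative names; only OUT_DECILE, P, LAPSE and decile come from the original code.

```sas
/* Lower probability bound of each training decile (decile already 1-10) */
proc means data=OUT_DECILE noprint nway;
   class decile;
   var P;
   output out=TRAIN_CUTS min=p_min;
run;

/* One row holding cut1-cut10 */
proc transpose data=TRAIN_CUTS out=CUTS_WIDE(keep=cut:) prefix=cut;
   id decile;
   var p_min;
run;

/* Assign validation rows to the training-defined deciles */
data VALID_DECILE;
   if _n_ = 1 then set CUTS_WIDE;     /* cutoffs are retained on every row */
   set VALID_SCORED;                  /* assumed name of scored validation data */
   array cut{10} cut1-cut10;
   decile = 10;                       /* default: below all other bounds */
   do i = 1 to 9;
      if not missing(cut{i}) and P >= cut{i} then do;
         decile = i;
         leave;
      end;
   end;
   drop i cut1-cut10;
run;

proc means data=VALID_DECILE n mean sum;
   class decile;
   var LAPSE;
run;
```

With this definition the validation deciles will generally not be equal in size. A large shift in N Obs per decile would itself suggest that the score distribution differs between the two samples.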