Hi,

I have built an attrition model and I am evaluating its performance. I sorted the predicted probabilities from high to low and divided the customers into ten equally sized groups ("deciles"), so that each decile contains ten percent of the customer base. I then looked at the attrition rate by decile, using the code below.

    /* Rank the predicted probability P into 10 groups, highest first (ranks 0-9) */
    proc rank data=OUT groups=10 out=OUT_DECILE descending;
       var P;
       ranks decile;
    run;

    /* Shift the ranks so that deciles run from 1 to 10 */
    data OUT_DECILE;
       set OUT_DECILE;
       decile=decile+1;
    run;

    /* Count, attrition rate and number of lapses per decile */
    proc means data=OUT_DECILE n mean sum;
       var LAPSE;
       class decile;
    run;

I first ran it on the training dataset used to build the model, which gives the first table below. Then I scored the validation dataset and ranked its probabilities again, which gives the second table.

Shouldn't I get roughly the same percentage per decile in both? Does this mean my model is not performing well? I used the gain chart to compare validation and training, and they looked fine.

Your help would be much appreciated.

Analysis Variable : LAPSE : Training Sample

    Rank for Variable pred    N Obs    Decile Mean    Overall Mean
     1                       20,986        79%            30%
     2                       21,014        70%            30%
     3                       20,999        38%            30%
     4                       21,041        29%            30%
     5                       17,839        25%            30%
     6                       24,168        22%            30%
     7                       20,952        20%            30%
     8                       21,013        12%            30%
     9                       20,998         7%            30%
    10                       20,990         5%            30%

Analysis Variable : LAPSE : Validation Sample

    Rank for Variable pred    N Obs    Decile Mean    Overall Mean
     1                        9,034       100%            21%
     2                        9,034       100%            21%
     3                        9,034        13%            21%
     4                        8,968         0%            21%
     5                       10,822         0%            21%
     6                        7,312         0%            21%
     7                        9,033         0%            21%
     8                        9,014         0%            21%
     9                        9,055         0%            21%
    10                        9,034         0%            21%

I have attached the gain chart.

Many thanks