Thank you Reeza,

Below is what I went through to construct the model.

Getting the churn variable

1. I took all active subscribers for May 2012 and checked which were still present in August. Those still present = non-churn; those not present = churn.

2. Using the churn variable for May, I constructed the churn prediction variables from data for the months of March, April, and May. From these I got usage statistics per subscriber for each month and constructed means and ratios, ending up with about 500 variables. The data used included revenues for voice, SMS, data, value-added services, etc., broken into on-same-network, other-networks, and international.

3. I subjected the constructed data to the model development process in SAS Enterprise Miner. The model had the following nodes:
a. Sample node: rare event oversampled to 25% from 4.9% using stratified sampling (total sample is about one million subscribers).
b. Data partition: training 50%, validation 30%, and testing 20%.
c. Principal components and variable transformation using the distribution.
d. Decision tree, neural network, and regression models fed from the above nodes, plus one decision tree fed directly from the partitioned data.

The decision tree fed from the variable transformation node performed best, with a misclassification rate of 0.18.

I scored the model and applied the results to the data for June. The results are presented in the attached file.

Thank you again for your help.
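
For reference, steps 1 and 3a can be sketched roughly in Python as below. This is only an illustration, not the actual Enterprise Miner flow: the function names and the toy subscriber IDs are hypothetical, and the sketch assumes the oversampling keeps every churner and randomly drops non-churners until churners make up 25% of the sample.

```python
import random

def label_churn(active_may, active_august):
    """Flag each May-active subscriber: 1 = churn (absent in August), 0 = non-churn."""
    august = set(active_august)
    return {sid: 0 if sid in august else 1 for sid in active_may}

def oversample_rare_event(labels, target_rate=0.25, seed=0):
    """Keep all churners and downsample non-churners to reach target_rate.

    Assumption: this mimics the effect of the stratified sample node;
    it is not a reproduction of how Enterprise Miner samples internally.
    """
    rng = random.Random(seed)
    churners = sorted(sid for sid, y in labels.items() if y == 1)
    non_churners = sorted(sid for sid, y in labels.items() if y == 0)
    # Number of non-churners needed so that churners / total == target_rate.
    n_keep = round(len(churners) * (1 - target_rate) / target_rate)
    n_keep = min(n_keep, len(non_churners))
    return churners + rng.sample(non_churners, n_keep)

# Toy data (made up): 1000 May actives, 49 of them gone by August (4.9% churn).
active_may = list(range(1000))
active_august = list(range(49, 1000))

labels = label_churn(active_may, active_august)
sample = oversample_rare_event(labels, target_rate=0.25)
rate = sum(labels[sid] for sid in sample) / len(sample)
print(len(sample), rate)
```

Because the sample is balanced artificially, rates measured on it (such as the 25% churn share) describe the sample rather than the full subscriber base.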