04-19-2014 10:54 PM
I have a data set in Excel.
It contains a target column: churners = 1, non-churners = 0.
I am a complete beginner with SAS Enterprise Miner,
so I need someone to help me. It is very urgent for me, please.
I have attached the data; the CHURN column is my target value (flag).
I want to understand which customers are churning,
and I want to come up with initial findings on how churners differ from non-churners.
I must build at least one or two models to understand which columns are important for churners, etc.
04-21-2014 11:48 AM
Are you looking at real data or just a toy data set? You may want to look into association rules: there are many papers on them, both from SAS and from academia, including some from the telecom area.
04-23-2014 10:39 AM
As a new user of Enterprise Miner, a good place to start is the Getting Started guide. It will walk you through an example using some data preparation nodes, modeling nodes, model comparison, and the scoring of new observations. Find the correct version here: SAS Enterprise Miner
If at some point you have time stamps in your data, you might be interested in this SAS Global Forum paper on survival data mining. This technique helps to predict not just if a customer will churn, but when: http://support.sas.com/resources/papers/proceedings12/132-2012.pdf
04-23-2014 01:53 PM
Good suggestions so far. Laura's suggestion is very good.
With a binary target, many analysts (and business users) like the output from the Decision Tree node. It can help you identify which variables might be important factors for churn and give you a model to score new data.
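To make the idea of "important factors" a bit more concrete, here is a rough sketch in plain Python (not Enterprise Miner code) of how a tree might rank candidate inputs by how much a single split reduces impurity. Gini impurity is used here only as one common criterion; it is not meant to describe the Decision Tree node's actual settings. The rows and the column names `contract` and `region` are made up for illustration and do not come from the attached data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def impurity_reduction(rows, feature, target):
    """Drop in Gini impurity from splitting rows on one categorical feature."""
    parent = gini([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    # Weight each child's impurity by its share of the rows.
    weighted = sum(len(g) / len(rows) * gini(g) for g in groups.values())
    return parent - weighted


# Made-up toy rows; the column names are hypothetical.
rows = [
    {"contract": "monthly", "region": "north", "churn": 1},
    {"contract": "monthly", "region": "south", "churn": 1},
    {"contract": "monthly", "region": "north", "churn": 1},
    {"contract": "yearly", "region": "south", "churn": 0},
    {"contract": "yearly", "region": "north", "churn": 0},
    {"contract": "yearly", "region": "south", "churn": 1},
]

# Rank the candidate inputs by how well one split separates churners.
ranking = sorted(
    ["contract", "region"],
    key=lambda f: impurity_reduction(rows, f, "churn"),
    reverse=True,
)
print(ranking)
```

In this toy example `contract` separates churners from non-churners much better than `region`, so it would be the first split. A real tree repeats this search inside each branch.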
Here is some good reading: http://support.sas.com/publishing/pubcat/chaps/57587.pdf
A REALLY good book to pick up is http://www.amazon.com/Data-Mining-Techniques-Relationship-Management/dp/0470650931. It has a great chapter on decision trees and likely covers survival analysis too.
Your data set has character variables that I *think* should be numeric. For example, Revenue appears as 22.000.000, which I think should have been $22,000,000 (or 22000000).
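If you would rather clean such values before importing, something along these lines could work. This is a minimal Python sketch, not an Enterprise Miner feature, and it assumes that "." and "," in these columns are only thousands separators (no decimal part). The function name `parse_dotted_number` is just an illustrative choice.

```python
def parse_dotted_number(text):
    """Convert a string like '22.000.000' or '$22,000,000' to an int.

    Assumption: '.' and ',' are thousands separators only, so they
    can simply be dropped. Values with a decimal part would be wrong.
    """
    cleaned = text.strip().replace("$", "").replace(".", "").replace(",", "")
    return int(cleaned)


print(parse_dotted_number("22.000.000"))
print(parse_dotted_number("$22,000,000"))
```

Blank or non-numeric cells would raise a `ValueError` here, so you would need to decide separately how missing values should be handled.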
When you import the data into EM, make sure you spend the time to set the roles and levels of each variable. Churn would be Target and Binary. There are a number of other variables that should be set to Binary too.
Customer could be set to a role of ID.
ChurnDep should be Rejected because it is redundant with Churn. The decision tree would split once on that variable and then stop splitting.
Calibrat should also likely be rejected.
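The point about ChurnDep can be checked outside EM with a quick test: does every value of a column map to exactly one value of the target? Below is a rough Python sketch of that check. The rows and the values in them (including the `tenure_band` column) are invented for illustration and are not taken from the attached data.

```python
def is_redundant_with_target(rows, feature, target):
    """True if every value of `feature` maps to exactly one target value."""
    seen = {}
    for r in rows:
        # Remember the first target seen for this value; any later
        # disagreement means the column does not determine the target.
        if seen.setdefault(r[feature], r[target]) != r[target]:
            return False
    return True


# Made-up toy rows; the values are hypothetical.
rows = [
    {"churndep": "dept_a", "tenure_band": "short", "churn": 1},
    {"churndep": "dept_b", "tenure_band": "long", "churn": 1},
    {"churndep": "none", "tenure_band": "short", "churn": 0},
    {"churndep": "none", "tenure_band": "long", "churn": 0},
]

print(is_redundant_with_target(rows, "churndep", "churn"))
print(is_redundant_with_target(rows, "tenure_band", "churn"))
```

Note that a column with a unique value per row, such as a customer ID, also passes this check trivially, which is one more reason to give it the ID role rather than use it as an input.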