I am trying to figure out whether I should do any oversampling or undersampling in a data set where the non-event (the outcome I am not trying to predict) is very small. I am trying to predict customer churn. Customers who churned make up 80% of the data; the other 20% did not churn. My goal is to score each customer with a churn probability (which I would then use in a lifetime value calculation).
What should I do? (I know how to write SAS code, and I also have Enterprise Guide (EG) and Enterprise Miner (EM), so I can handle the implementation in the software myself.)