08-26-2015 02:46 AM
I am using predictive modelling to decide which customers should receive email newsletters, with methods such as decision trees and logistic regression.
I have some datasets where all the customers (one ID per customer) have received a newsletter, which means I can draw a simple random sample from the recipients of that newsletter. However, I also have datasets where the newsletter was sent out to only a subset of customers, selected by a predefined affinity logic. How would you deal with such a dataset? Is it even possible to use it for predictive modelling, given that it will not contain a random representation of people who are interested in the newsletter and people who are not?
Second, how would I keep validating a model once it has been created? The data I collect from then on will be non-random, because who receives the newsletter is decided by the model itself. When I later want to check whether the model is still stable or should be changed, I guess I need to send out the newsletter to a random selection of those customers who would otherwise be predicted not to receive it?
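To make the idea concrete, here is a minimal sketch of that random selection in Python. Everything in it is an assumption for illustration: the function name `choose_recipients`, the `scores` dictionary of model scores per customer ID, the 0.5 threshold and the 10% exploration rate are all hypothetical, not taken from any existing system.

```python
import random

def choose_recipients(scores, threshold=0.5, explore_rate=0.1, seed=0):
    """Split customers into model-selected recipients and a random
    exploration sample drawn from the customers the model rejects.

    scores: dict mapping customer ID -> model score (hypothetical input).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Customers the model would mail anyway.
    selected = [cid for cid, s in scores.items() if s >= threshold]
    # Customers the model would skip; sorted so the sample is deterministic.
    rejected = sorted(cid for cid, s in scores.items() if s < threshold)
    # Mail a small random share of the rejected customers as well.
    k = round(explore_rate * len(rejected))
    explore = rng.sample(rejected, k)
    return selected, explore

# Toy scores for 100 hypothetical customers, c0 .. c99.
scores = {f"c{i}": i / 100 for i in range(100)}
selected, explore = choose_recipients(scores)
```

The responses of the `explore` group would then give an unbiased view of how the "do not mail" customers actually behave, which is the part of the population the model otherwise never observes.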
Thanks a lot for your help in advance,
09-09-2015 09:32 PM
You can build an incremental response model (a.k.a. net lift model) on the data from the campaigns where you only sent the offer to certain customers.
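As a rough illustration of what "net lift" means, the sketch below compares response rates between customers who received the offer and customers who did not, within each segment. This is only a segment-level estimate in plain Python, not a full incremental response model, and it assumes that treated and untreated customers within a segment are comparable. The function name `net_lift_by_segment`, the segment labels and the toy records are all hypothetical.

```python
from collections import defaultdict

def net_lift_by_segment(records):
    """Estimate net lift per segment.

    records: iterable of (segment, treated, responded) tuples, where
    treated says whether the customer received the offer and responded
    says whether the customer showed the target behaviour.
    Returns {segment: treated response rate - control response rate}.
    """
    counts = defaultdict(lambda: {"t_n": 0, "t_resp": 0, "c_n": 0, "c_resp": 0})
    for segment, treated, responded in records:
        c = counts[segment]
        if treated:
            c["t_n"] += 1
            c["t_resp"] += int(responded)
        else:
            c["c_n"] += 1
            c["c_resp"] += int(responded)

    lift = {}
    for segment, c in counts.items():
        if c["t_n"] == 0 or c["c_n"] == 0:
            continue  # lift cannot be estimated without both groups
        lift[segment] = c["t_resp"] / c["t_n"] - c["c_resp"] / c["c_n"]
    return lift

# Hypothetical toy data: (segment, treated, responded).
records = [
    ("high_affinity", True, True), ("high_affinity", True, True),
    ("high_affinity", False, True), ("high_affinity", False, True),
    ("low_affinity", True, True), ("low_affinity", True, False),
    ("low_affinity", False, False), ("low_affinity", False, False),
]
lift = net_lift_by_segment(records)
```

In this toy data the high-affinity customers respond whether or not they get the offer, so their net lift is zero, while the low-affinity segment shows a positive lift. That difference between "likely to respond" and "likely to respond because of the offer" is what an incremental response model targets.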
A paper that uses direct marketing as an example:
A paper that goes a little deeper into theory, with an example:
I hope this helps!