was hoping someone had any ideas on the following.
let's say we have monthly mailings plus a holdout cell over a period of 12 months.
we measure response of mailed vs control at an aggregate level, but how would i go about building something to rank sites by how likely they are to respond? in essence, i want to identify the key characteristics (firmographics, behaviours etc.) that are proving the most responsive, and to rank each site by the impact of DM on its response.
i'm having trouble formulating the model: i want to capture the impact of the mailings, as opposed to just separating those that spent from those that didn't.
Did you think about fitting a logistic regression model with responded/not responded as the outcome and the characteristics you mention (firmographics, behaviours etc.) as predictors?
In fact, logistic regression models the probability of an event, so in your case it models the probability of a site responding, and you can rank the sites by their predicted probabilities. You can also put the sites into categories according to the predicted probability of responding, for example low: <10%, mid: 10-30%, high: >30%.
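As a rough sketch of the ranking and banding step only: the probabilities below are made up and stand in for the output of whatever fitted logistic regression you end up with. The site names, the `band` helper and the choice to count exactly 30% as "mid" are all assumptions for illustration.

```python
# Hypothetical predicted response probabilities per site, e.g. taken from
# a fitted logistic regression (values are made up for illustration).
predicted = {"site_a": 0.04, "site_b": 0.22, "site_c": 0.41, "site_d": 0.10}

def band(p):
    """Map a predicted probability to the low/mid/high bands suggested above."""
    if p < 0.10:
        return "low"
    if p <= 0.30:  # assumption: exactly 30% is treated as mid
        return "mid"
    return "high"

# Rank sites from most to least likely to respond.
ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
bands = {site: band(p) for site, p in predicted.items()}

for site, p in ranked:
    print(f"{site}: p={p:.2f} band={bands[site]}")
```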
unfortunately that model is still looking at who is most likely to start spending, with an overall mailing effect.
What i want is for the model to measure how that 'mailing' coefficient varies across the firmographics etc.
As i see it, these are 2 separate approaches,
- the 1st measures how likely a site is to start spending, which is the natural rate if you like.
- the 2nd measures how much the mailing impacts whether they start spending (if it does at all!)
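One possible way to get at the 2nd approach is to add mailing x characteristic interaction terms to the logistic regression, then score each site twice (as if mailed and as if held out) and rank on the difference. The sketch below is only that: the data is simulated, `large` is a hypothetical binary firmographic, the fit is a hand-rolled Newton-Raphson rather than any particular package, and it treats each site as a single observation, so it does not deal with the repeated monthly mailings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Simulated data (made up): one binary characteristic and a mailed/holdout flag.
large = rng.integers(0, 2, n)    # hypothetical firmographic: 1 = large site
mailed = rng.integers(0, 2, n)   # 1 = mailed, 0 = holdout

# Simulation truth: large sites respond more anyway, and mailing only helps them.
true_logit = -2.0 + 0.5 * large + 0.0 * mailed + 1.0 * large * mailed
responded = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

def design(large, mailed):
    """Columns: intercept, characteristic, mailing, mailing x characteristic."""
    return np.column_stack(
        [np.ones(len(large)), large, mailed, large * mailed]
    ).astype(float)

def fit_logistic(X, y, iters=25):
    """Plain Newton-Raphson fit of a logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

beta = fit_logistic(design(large, mailed), responded)

def predict(large, mailed):
    return 1 / (1 + np.exp(-design(large, mailed) @ beta))

# Per-site uplift: predicted response if mailed minus predicted response if held out.
ones, zeros = np.ones(n, dtype=int), np.zeros(n, dtype=int)
uplift = predict(large, ones) - predict(large, zeros)

# Rank sites by estimated mailing impact, not by natural response rate.
order = np.argsort(-uplift)

print("interaction coefficient:", beta[3])
print("mean uplift, large sites:", uplift[large == 1].mean())
print("mean uplift, small sites:", uplift[large == 0].mean())
```

The interaction coefficient (`beta[3]` here) is the piece that says how the 'mailing' effect shifts with the characteristic; the main effect of `large` on its own is the natural rate.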
I could just measure the differences in uplift at some pre-determined level (e.g. industry by size), however i need to take into account the repeated mailings and a model would also allow me to throw more variables into the mix.
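For comparison, the simple segment-level version might look something like the following. The records and segment labels are invented, each site appears once, and nothing here accounts for the repeated mailings, which is exactly the limitation mentioned above.

```python
from collections import defaultdict

# Hypothetical site-level records: (segment, mailed flag, responded flag).
records = [
    ("retail/small", 1, 1), ("retail/small", 1, 0),
    ("retail/small", 0, 0), ("retail/small", 0, 0),
    ("retail/large", 1, 1), ("retail/large", 1, 0),
    ("retail/large", 0, 1), ("retail/large", 0, 0),
]

# counts[segment][mailed] = [responders, total sites]
counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
for segment, mailed, responded in records:
    counts[segment][mailed][0] += responded
    counts[segment][mailed][1] += 1

def rate(cell):
    responders, total = cell
    return responders / total

# Uplift per segment: mailed response rate minus holdout response rate.
segment_uplift = {
    seg: rate(cells[1]) - rate(cells[0]) for seg, cells in counts.items()
}

for seg, value in sorted(segment_uplift.items(), key=lambda kv: -kv[1]):
    print(seg, value)
```

In this toy data the large segment responds at the same rate whether mailed or not, so its uplift is zero even though its natural rate is higher than the small segment's.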