11-11-2015 12:28 PM - edited 11-11-2015 01:14 PM
I work in the marketing analytics team of a bank, and we work mainly on test/control testing. The test group is the target population that was sent offers or campaigns; the control group is left untreated. My team analyzes the difference in behavior between these two groups to gauge campaign performance.
Before comparing the groups, we ensure that the test and control are "like-to-like," meaning that they are similar in terms of a variable thought to be significant. For example, for a campaign aimed at improving customers' checking-account balances, we ensure that the test and control groups have equal mean checking-account balances. We have an iterative procedure that helps us do this:
(1) Run proc ttest on the test/control data with the checking-account balance as the "var" variable and "TEST/CONTROL" as the "class" variable.
(2) If the p-value is not greater than 0.98, remove extreme observations from the appropriate group to bring the means closer together.
(3) Repeat (1) and (2) until the p-value is greater than 0.98.
Please note that in our process of removal, we can't lose more than 2% of the test data.
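To make the loop concrete, here is a rough sketch of it, written in Python rather than SAS so that it is self-contained. Several things in it are assumptions and not our actual code: the function names are hypothetical, the p-value uses a large-sample normal approximation to the Welch statistic instead of the t distribution that proc ttest reports, and the removal rule (drop the largest value from the group with the higher mean, and fall back to the smallest control value once the 2% test budget is spent) is only one reading of "remove extreme observations from the appropriate group."

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_p_value(a, b):
    """Two-sided p-value for equal means (Welch statistic, normal approximation)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def trim_until_balanced(test, control, p_target=0.98, max_test_loss=0.02):
    """Drop extreme values one at a time until the p-value exceeds p_target."""
    test, control = sorted(test), sorted(control)
    budget = int(max_test_loss * len(test))  # test rows we are allowed to drop
    removed_test = 0
    while welch_p_value(test, control) <= p_target:
        if min(len(test), len(control)) <= 2:
            raise ValueError("ran out of observations before reaching p_target")
        if mean(test) > mean(control):
            if removed_test < budget:
                test.pop()          # drop the largest test value
                removed_test += 1
            else:
                control.pop(0)      # budget spent: drop the smallest control value
        else:
            control.pop()           # drop the largest control value
    return test, control, removed_test
```

Calling `trim_until_balanced(test_balances, control_balances)` returns the trimmed groups and the number of test rows removed, which can be checked against the 2% limit.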
I need help on two things:
(1) Is this procedure valid and is there a more effective way to ensure that the test/control populations are alike in terms of a single variable?
(2) Our superiors want us to take more than one variable into consideration in our "like-to-like" process. In this case, the proc ttest iterations cannot be used. How can we tackle this?
Please let me know if you need further elaboration on these questions. Thanks for reading.
11-11-2015 04:05 PM
If your significant variable really makes a difference in the behavior of your customers, then pairing would give you more power in your comparison. To enable a paired comparison, each tested customer should be paired with one or more controls on the basis of your significant variable(s).
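As a minimal illustration of the pairing idea, the sketch below matches each tested customer to the closest unused control on a single variable and then computes a paired t statistic on the outcome differences. It is written in Python rather than SAS to stay self-contained; the function names, the greedy 1:1 matching without replacement, and the `(covariate, outcome)` tuple layout are assumptions made for the example, not a prescribed method.

```python
from math import sqrt
from statistics import mean, stdev


def match_nearest(test, control):
    """Greedy 1:1 matching without replacement.

    Each customer is a (covariate, outcome) tuple; matching uses the covariate.
    """
    available = list(control)
    pairs = []
    for t in test:
        if not available:
            break  # no controls left to match
        j = min(range(len(available)), key=lambda i: abs(available[i][0] - t[0]))
        pairs.append((t, available.pop(j)))
    return pairs


def paired_t_statistic(pairs):
    """t statistic for the mean within-pair outcome difference (test minus control)."""
    diffs = [t[1] - c[1] for t, c in pairs]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```

Greedy matching depends on the order in which tested customers are processed, so with real data an optimal matching routine may give closer pairs overall.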
If you want to involve more than one variable, one way is to define a distance that combines the variables into a single measure.
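One possible such distance, sketched here in Python under stated assumptions, is a standardized Euclidean distance: each variable is divided by its standard deviation so that variables measured on large scales (such as balances) do not swamp the others. The function names and the choice of distance are illustrative only; a Mahalanobis distance, which also accounts for correlation between variables, is a common alternative.

```python
from math import sqrt
from statistics import stdev


def column_scales(rows):
    """Standard deviation of each variable across all rows (test and control pooled)."""
    return [stdev(col) or 1.0 for col in zip(*rows)]  # guard against zero spread


def standardized_distance(x, y, scales):
    """Euclidean distance after dividing each variable by its scale."""
    return sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, scales)))


def nearest_control(t, controls, scales):
    """Index of the control closest to tested customer t."""
    return min(range(len(controls)),
               key=lambda i: standardized_distance(t, controls[i], scales))


# Hypothetical customers described by (checking balance, age).
tested = (1000.0, 30.0)
controls = [(1100.0, 60.0), (3000.0, 31.0), (1200.0, 33.0)]
scales = column_scales([tested] + controls)
best = nearest_control(tested, controls, scales)
```

With these made-up values, the unscaled distance would favor the first control because its balance is closest, while the standardized distance favors the third, which is reasonably close on both variables.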
11-16-2015 04:23 PM
This falls in the general area of causal modeling of observational data. SAS tools (macros) for this have been developed by the RAND Corporation. See their TWANG (Toolkit for Weighting and Analysis of Nonequivalent Groups) project page.