I work in the marketing analytics team of a bank, and we work mainly on test/control testing. The test group is the target population that was sent offers/campaigns, and the control group is left untreated. My team analyzes the difference in behavior between these two groups to gauge campaign performance.

Before we move on to the comparison, we ensure that the test and control groups are "like-to-like," meaning they are similar in terms of a variable that is thought to be significant. For example, for a campaign aimed at improving customers' checking-account balances, we ensure that the test and control groups have equal mean checking-account balances.

We have an iterative procedure that helps us do this:

(1) Run proc ttest on the test/control data with the checking-account balance as the "var" variable and "TEST/CONTROL" as the "class" variable.
(2) If the p-value is not greater than 0.98, remove extreme observations from the appropriate group to bring about equal means.
(3) Repeat (1) and (2) until the p-value is greater than 0.98.

Please note that in our process of removal, we can't lose more than 2% of the test data.

I need help on two things:

(1) Is this procedure valid, and is there a more effective way to ensure that the test/control populations are alike in terms of a single variable?
(2) Our superiors want us to take more than one variable into consideration in our "like-to-like" process. In this case, the proc ttest iterations cannot be used. How can we tackle this?

Please let me know if you need further elaboration on these questions. Thanks for reading.
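To make the iterative procedure above concrete, here is a rough Python sketch of the loop. It is not our actual SAS code. Several details are assumptions made purely for illustration: the p-value uses a large-sample normal approximation rather than the t distribution that proc ttest uses, "remove extreme observations from the appropriate group" is interpreted as dropping the largest value from whichever group currently has the higher mean, and the function names (`approx_p_value`, `trim_until_balanced`) and the simulated balances are hypothetical.

```python
import math
import random
from statistics import NormalDist


def mean_var(x):
    """Return the sample mean and the unbiased sample variance of x."""
    n = len(x)
    m = math.fsum(x) / n
    v = math.fsum((xi - m) ** 2 for xi in x) / (n - 1)
    return m, v


def approx_p_value(a, b):
    """Two-sided p-value for equal means (unequal variances allowed).

    Assumption: uses a normal approximation, reasonable only for large
    samples; proc ttest itself uses the t distribution.
    """
    (ma, va), (mb, vb) = mean_var(a), mean_var(b)
    se = math.sqrt(va / len(a) + vb / len(b))
    if se == 0.0:
        return 1.0 if ma == mb else 0.0
    z = (ma - mb) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def trim_until_balanced(test, control, p_target=0.98, max_test_loss=0.02):
    """Drop extreme values until p > p_target or the test budget runs out.

    Assumed removal rule: drop the largest value from the group whose
    mean is currently higher. At most max_test_loss of the original
    test observations may be removed.
    """
    test, control = sorted(test), sorted(control)
    budget = int(max_test_loss * len(test))
    removed_test = 0
    p = approx_p_value(test, control)
    while p <= p_target:
        if mean_var(test)[0] > mean_var(control)[0]:
            if removed_test >= budget:
                break  # cannot trim the test group any further
            test.pop()  # largest test value
            removed_test += 1
        else:
            if len(control) <= 2:
                break  # keep enough control data to compute a variance
            control.pop()  # largest control value
        p = approx_p_value(test, control)
    return test, control, p, removed_test


# Simulated checking-account balances (hypothetical data).
random.seed(1)
demo_test = [random.gauss(5100, 1500) for _ in range(500)]
demo_control = [random.gauss(5000, 1500) for _ in range(500)]
_, _, final_p, n_removed = trim_until_balanced(demo_test, demo_control)
print(f"final p-value: {final_p:.3f}, test rows removed: {n_removed}")
```

The loop can stop before reaching the 0.98 threshold if the 2% limit on the test group is hit first, which is one of the practical issues we run into.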