11-18-2015 10:28 AM
I have a website A/B test coming up. The test is this.
--I'm testing sale messaging content slot placement on a Home Page
--I have 2 versions of the Home Page created to show different audiences
--I want to be able to determine "by noon" whether the results confirm my lift hypothesis. I think the metric used to confirm this would be revenue generated by visitors who viewed the Home Page during their visit
Again, on the day a test launches, I want to know whether I can determine by noon if it's successful and statistically significant. I also want to be able to predict ahead of time how long a test needs to run to reach significance. I'm looking for advice on how to approach this. We use SAS in our office. For direct mail campaigns I use a power analysis to calculate sample size, but in this situation I'm not sure how to determine the sample size or the length of time it will take to reach significance.
From what I've been researching, I am pretty sure power analysis should work, but there are steps in the process I'm not sure I understand.
The literature I’ve been reading says:
The literature then says the power analysis will estimate the sample size needed to meet these conditions. From there, you should be able to use page-view metrics to convert this estimate into time. I’m not sure how to interpret that last statement.
For a DM A/B test I use this code for test and control groups.
proc power;
   TwoSampleFreq Test=Fisher
      Alpha = 0.05
      Sides = U
      GroupProportions = ( /*Historical Benchmark: Response Rate*/ /*lift*/ )
      Power = .8
      Npergroup = .;
run;
Any examples, assistance with this or sample code will be greatly appreciated. Thanks!
11-18-2015 01:31 PM
From there, you should be able to use page-view metrics to convert this estimate into time. I’m not sure how to interpret that last statement.
Basically, you look at something like page hits per hour. If you get 100 hits per hour and the power analysis says you need 500 views, that translates to 5 hours. So if you need the answer by noon, you need to start at least 5 hours before noon.
Of course your page hits are possibly influenced by time of day but that may give you enough to start.
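A minimal sketch of that conversion as a SAS data step, using the illustrative numbers above (500 views, 100 hits per hour). The variable names are made up for the example, and the 500 is treated here as the total number of views across both page versions; if the power analysis reports a per-group size, double it first.

```sas
/* Convert a required number of views into run time.            */
/* All values are illustrative assumptions, not real traffic.   */
data _null_;
   views_needed  = 500;   /* total views required (both versions) */
   hits_per_hour = 100;   /* average page hits per hour           */

   /* Round up: a partial hour still has to be waited out */
   hours_needed = ceil(views_needed / hits_per_hour);

   /* Latest launch time that still finishes by noon */
   launch_by = '12:00't - hours_needed * 3600;

   put hours_needed= launch_by= time5.;
run;
```

If traffic varies a lot by time of day, the same arithmetic could be applied with the hourly rate for the specific hours before noon rather than a daily average.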
The effect size is going to be related to how you measure preference, or whatever the two versions are supposed to garner. It might be the difference on some sort of like scale (paired tests if each person views both pages) or the difference in overall average scores. Something like "switch if the new page has an average score 0.4 or more larger on a 1 to 10 scale than the current page". I'm not making a recommendation; that is just the kind of statement you might find in a planning or requirements document.
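To show how an effect size like that feeds into a sample size, here is a rough PROC POWER sketch for comparing two group means. The 0.4 difference comes from the example above; the standard deviation of 2 is purely an assumed value and would need to come from your own historical data.

```sas
/* Sample size per group to detect a 0.4 difference in mean score. */
/* stddev = 2 is a hypothetical value, not taken from real data.   */
proc power;
   twosamplemeans test=diff
      alpha     = 0.05
      meandiff  = 0.4    /* smallest difference worth acting on    */
      stddev    = 2      /* assumed common standard deviation      */
      power     = 0.8
      npergroup = .;     /* solve for the size of each group       */
run;
```

The per-group number this reports, multiplied by the number of page versions, is the view count you would then convert into hours using your page-hit rate.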
11-18-2015 01:49 PM
This is great ballardw, thank you. I am still a little confused as to how to determine the sample size or number of page hits. Using your example, how would I determine that I need 500 views to reach statistical significance?
11-18-2015 02:31 PM
Thanks pearsoninst! Real quick: out of the Proc Power examples on that website, which example would you suggest I follow for this