10-17-2011 07:06 PM
In SAS 9.1.3 I want to use PROC POWER for designing an A/B test for two treatments which produce proportions of success. Based on my STAT 301 textbook, I was looking for two-sample t-test for proportions, but I didn't see it. Should I be using "TwoSampleFreq test=pchi" instead?
I expect the control group to have a 36.6% success rate and want the minimum number of samples needed to show the test group is not much worse, but I'm confused by the null and alternative hypotheses. It is good if the two groups are the same because this means we can save money on a cheaper treatment, but normally the null hypothesis means no difference and it is "good" to reject the null hypothesis. In other words, isn't it wrong to have "The null hypothesis is the two treatments have the same success rate" and try to not reject H0?
10-17-2011 08:43 PM
I really should leave this for the statisticians, which I am NOT, but isn't the correct test the proportions test, which is simply a z-score using a normal distribution as the criterion for the test?
Regardless, one doesn't accept a null hypothesis. One can only reject it or fail to reject it.
10-18-2011 11:02 AM
To compare two treatments/proportions, my statistics textbook uses a t-test and a t-table, but as n is large (in my case I am expecting about 500), the t approximates z.
10-18-2011 11:12 AM
What you are really asking about here is controlling the type II error (1 - Power), basically a test of "equivalence". Since your desire is to fail to reject the null hypothesis, you want the power to be fairly high.
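To make the link between power and type II error concrete, here is a rough sketch of the usual normal-approximation power formula for a two-sided, two-proportion z-test. The 30% rate for the test group and the 500 per group are assumptions for illustration only, and the result will not necessarily match PROC POWER digit for digit.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_power(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided pooled z-test for two proportions.

    Uses the normal approximation with equal group sizes and ignores the
    (negligible) chance of rejecting in the wrong direction.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))          # pooled SD under H0
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # unpooled SD under H1
    shift = abs(p1 - p2) * sqrt(n_per_group)
    return norm.cdf((shift - z_crit * sd_null) / sd_alt)


# Assumed scenario: control 36.6%, test group 30%, 500 per group.
power = two_sided_power(0.366, 0.30, 500)
type_ii_error = 1 - power
print(f"power = {power:.3f}, type II error = {type_ii_error:.3f}")
```

If the power comes out modest, a non-significant result says little about whether the two groups are really the same, which is why failing to reject only carries weight when power is high.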
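The z-test and the Pearson chi-square test are the same test for a two-group proportion comparison: without a continuity correction, the chi-square statistic is the square of the pooled z statistic.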
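A quick numerical check of that equivalence, as a sketch: the counts below (183 of 500 versus 165 of 500) are made up, and the function names are my own. The pooled z statistic squared should match the Pearson chi-square statistic for the 2x2 table, with no continuity correction applied.

```python
from math import sqrt


def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_chi2(x1, n1, x2, n2):
    """Pearson chi-square statistic for the 2x2 success/failure table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts, for illustration only.
z = pooled_z(183, 500, 165, 500)
chi2 = pearson_chi2(183, 500, 165, 500)
print(f"z^2 = {z ** 2:.6f}, chi-square = {chi2:.6f}")
```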
Using a t-test on proportions is an approximation to the z-score. However, if you have a big enough sample, the t-test converges to the z-score. Because you want to fail to reject with a high degree of confidence, the two would yield nearly the same results.
There has been a lot written in the medical literature about testing equivalence and the hazards that it can entail (e.g., is the question "no worse" or "about the same"?). Before you launch a study, you should read some of the issues discussed there. Do a Google Scholar search for R Temple (author) with "equivalence" in the title. He discusses the issues in a readable fashion and has references to the appropriate statistical literature.
10-18-2011 01:18 PM
Thank you for the article. I started reading it, and it looks promising. The parts about ethics and placebos are not directly applicable because this is a marketing test.
"Placebo-Controlled Trials and Active-Control Trials in the Evaluation of New Treatments. Part 1: Ethical and Scientific Issues"
Robert Temple, MD; and Susan S. Ellenberg, PhD
So I will read this and try to frame the question better with my customer. I think my customer needs to agree to some decision threshold around which the test can be designed. In other words, "If the test group is no more than X units worse, then we will make the treatment the new BAU. Otherwise, we will continue with the old BAU."
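Once a threshold X is agreed, the design becomes a one-sided non-inferiority test: H0 is "the test group is worse by X or more" and the goal is to reject it. A rough sketch of the textbook normal-approximation sample size for that setup follows. The margin of 5 percentage points, the one-sided alpha of 0.05, and the 80% power are placeholder assumptions, not values from the customer, and PROC POWER may use a slightly different variance formula.

```python
from math import ceil
from statistics import NormalDist


def noninferiority_n(p_control, p_test, margin, alpha=0.05, power=0.80):
    """Per-group sample size for a one-sided non-inferiority z-test.

    H0: p_test - p_control <= -margin   (test group is unacceptably worse)
    H1: p_test - p_control >  -margin
    Normal approximation, unpooled variance, equal group sizes.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha)   # one-sided critical value
    z_beta = norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = (p_test - p_control) + margin
    if effect <= 0:
        raise ValueError("assumed true difference must lie inside the margin")
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Assumed: both groups truly at 36.6%, margin X = 0.05 (hypothetical).
n = noninferiority_n(0.366, 0.366, margin=0.05)
print(f"about {n} per group")
```

The required n grows quickly as the margin shrinks (roughly with 1 / X squared), so the choice of X drives the cost of the test.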