07-21-2015 05:49 PM
I am confused about the LOGISTIC statement in PROC POWER. For example, I am examining the association between smoking and cardiovascular disease. I have the following data.
I would like to use the syntax below:
vardist("smoking") = binomial (p,n)
testpredictor = "smoking"
responseprob = 0.60
testoddsratio = 1.5
ntotal = 250
power = .;
In SAS support, it says:
"BINOMIAL (p, n) is a binomial distribution with probability of success p and number of independent Bernoulli trials n. The value of p must be greater than 0 and less than 1, and n must be an integer greater than 0."
What should the p and n be in this example?
07-22-2015 08:38 AM
There are a couple of ways to go here, I think. One is to pick one of the two categories as the reference category. For instance, if it is the SMOKING=NO category, p would be 50/130 and n would be 130. Another is to use the overall proportion, so that p=150/250=0.6 and n=250.
This is untested, but since PROC POWER runs rapidly, I would start here and see what the results look like.
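For reference, the first suggestion could be written out as a complete call along these lines. This is a minimal sketch, not a tested program: it assumes the option lines from the question sit inside a LOGISTIC statement, and it writes p = 50/130 as the rounded decimal 0.3846.

```sas
/* Sketch only: option values are taken from the question; p and n follow
   the first suggestion above (reference category SMOKING=NO). */
proc power;
   logistic
      vardist("smoking") = binomial(0.3846, 130)  /* p = 50/130 rounded, n = 130 */
      testpredictor = "smoking"
      responseprob = 0.60
      testoddsratio = 1.5
      ntotal = 250
      power = .;                                  /* solve for power */
run;
```

To try the second suggestion, swap in binomial(0.6, 250). If smoking is meant as a yes/no indicator recorded once per subject, n = 1 (a single Bernoulli trial) may be the more natural choice; that is an assumption and has not been run here.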
07-22-2015 11:24 AM
Thanks for your response.
Since 250 and 0.6 are already used in the syntax, we may not want to repeat that information in the binomial (p, n) option.
Running the syntax with p = 50/130 and n = 130 gives power > 0.999.
Because the binomial is the distribution of smoking (the exposure here), I also ran the syntax with p = 120/250 and n = 250; power is again > 0.999.
However, running the following syntax gives a power of 0.263:
refproportion = 0.2
groupns = (100 150)
power = .;
Theoretically, the LOGISTIC statement and the TWOSAMPLEFREQ statement in PROC POWER should be equivalent. I am not sure why the results are inconsistent!
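The option lines above omit the statement header. A complete call might look like the following sketch, in which TEST=PCHI and ODDSRATIO=1.5 are assumptions added for illustration (neither appears in the post), so it may not reproduce the 0.263 figure.

```sas
/* Sketch only: test= and oddsratio= are assumed, not taken from the post. */
proc power;
   twosamplefreq test=pchi
      oddsratio = 1.5          /* assumed, to mirror testoddsratio in the LOGISTIC run */
      refproportion = 0.2
      groupns = (100 150)
      power = .;               /* solve for power */
run;
```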
07-22-2015 12:47 PM
Yes, but the OR in these data is 8. Plugging that into TWOSAMPLEFREQ gives power > 0.999, so I would guess that the two are returning roughly the same thing (although TWOSAMPLEFREQ is probably comparing against an OR of 1, rather than the TESTODDSRATIO of 1.5).
So, I think this design probably does have >.999 power to detect a difference in the OR of 8 vs 1. Does that make sense?
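The OR of 8 can be checked by hand from the counts implied earlier in the thread. The sketch below assumes a 2x2 table with 50 of 130 non-smokers and 100 of 120 smokers having the disease; the smoker counts are inferred from the quoted totals (150 of 250 with disease, 120 of 250 smokers) rather than stated outright. The variable names are made up for illustration.

```sas
data _null_;
   /* Counts inferred from the proportions quoted in the thread */
   smk_cvd   = 100;  smk_nocvd   = 20;   /* smokers: 100 of 120 with disease */
   nosmk_cvd = 50;   nosmk_nocvd = 80;   /* non-smokers: 50 of 130 with disease */
   /* Cross-product odds ratio: (a*d)/(b*c) */
   odds_ratio = (smk_cvd * nosmk_nocvd) / (smk_nocvd * nosmk_cvd);
   put odds_ratio=;   /* writes the value to the SAS log */
run;
```

By hand, this is (100 x 80) / (20 x 50) = 8.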
To me, the next logical step is to turn the problem inside-out and calculate sample sizes for a given OR. At first I got some really weird results (N=15 for 80% power), but I think those might be for a one-unit change in the predictor. So I shifted to a one-standard-deviation change (DEFAULTUNIT = SD), and now I get approximately 83% power to detect a shift from 1.5, or an N of 230 for 80% power. I have to admit that I don't have a gut feel for sample sizes in logistic regression like I do for two-sample comparisons, but this appears reasonable based on everything else here.
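The two calculations described here might be set up as below. This is a sketch under assumptions: the post does not say which VARDIST values were used, so binomial(0.48, 250) from earlier in the thread is reused purely as a placeholder, and the results will depend on that choice.

```sas
/* Sketch only: the VARDIST values are assumed; substitute the ones actually used. */
proc power;
   logistic
      vardist("smoking") = binomial(0.48, 250)   /* placeholder: p = 120/250, n = 250 */
      testpredictor = "smoking"
      responseprob = 0.60
      testoddsratio = 1.5
      defaultunit = sd       /* odds ratio refers to a one-SD change in the predictor */
      ntotal = 250
      power = .;             /* power at the planned N */
run;

proc power;
   logistic
      vardist("smoking") = binomial(0.48, 250)   /* same placeholder as above */
      testpredictor = "smoking"
      responseprob = 0.60
      testoddsratio = 1.5
      defaultunit = sd
      ntotal = .             /* solve for N */
      power = 0.8;
run;
```

Leaving out DEFAULTUNIT returns to the one-unit change that seemed to produce the very small N.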