Hi all,
I don't have a good understanding of power analysis yet, but I'm trying to run one to see what sample size would be needed to detect a significant effect on success. The logistic model looks approximately like the one below.
logit(P(success)) = b0 + b1*Treatment + b2*Time + b3*Treatment*Time + b4*Others
Treatment is binary.
Time is ordinal, with six time points: 0, 1, 2, 3, 4, 5.
Others are any other categorical (essentially binary) variables.
If I wanted to run the test at significance level 0.05 with power 0.8, how should I interpret the output from the code below? Thank you!
proc power;
logistic
vardist("Success") = BINOMIAL (0.5, 1)
vardist("Treatment") = binomial(0.5, 1)
vardist("time") = ordinal((0 1 3 4 5):(0 0.1 0.2 0.3 0.4))
testpredictor = "Success"
covariates = "Treatment"|"time"
responseprob = 0.4 0.5 0.6 .07
testoddsratio = 1.4
alpha = 0.05
power = 0.8
ntotal = .;
run;
Are you asking how to interpret this part of the result?
Computed N Total

Index | Response Prob | Actual Power | N Total |
---|---|---|---|
1 | 0.40 | 0.800 | 1160 |
2 | 0.50 | 0.800 | 1114 |
3 | 0.60 | 0.800 | 1160 |
4 | 0.07 | 0.800 | 4242 |
First, I suspect that you might have meant to use
responseprob = 0.4 0.5 0.6 0.7
instead of the actual
responseprob = 0.4 0.5 0.6 .07
Since you provided 4 options for responseprob you have an ntotal sample size calculated for each one.
If the actual responseprob value comes in close to 0.4 then you need a sample size around 1160 and so on for each responseprob value.
If you are asking why the last value has an N Total of 4242, much larger than the others, it is because of the probable typo of .07 where you may have intended 0.7. A lower probability of a response means that you need a larger sample to get the needed number of "successes".
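As a rough check on that reasoning, here is a small sketch. It assumes the required N scales roughly with 1/(p*(1-p)), where p is the response probability. That is only a back-of-the-envelope approximation, not the calculation PROC POWER actually performs, and the dataset and variable names (relative_n, n_ref, info, approx_n) are made up for illustration.

```sas
/* Rough scaling check. ASSUMPTION: required N is proportional to     */
/* 1 / (p * (1 - p)). This is a heuristic, not PROC POWER's method.   */
data relative_n;
   n_ref = 1114;                 /* N Total reported for responseprob = 0.5 */
   do p = 0.4, 0.5, 0.6, 0.07;
      info     = p * (1 - p);    /* binomial variance at this p */
      approx_n = round(n_ref * 0.25 / info);   /* 0.25 = variance at p = 0.5 */
      output;
   end;
run;

proc print data=relative_n noobs;
   var p info approx_n;
run;
```

Under that assumption the sketch gives about 1160 for 0.4 and 0.6 and about 4278 for 0.07, which is in the same ballpark as the 1160 and 4242 in the table above.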
Thanks, Ballard!
The interpretation is very helpful! Yes, I meant to use
responseprob = 0.4 0.5 0.6 0.7
Further questions I have are
1. Do the binary outcome variable and any binary covariates need to be set up as binomial(0.5, 1)?
2. I currently have vardist("time") = ordinal((0 1 3 4 5):(0 0.1 0.2 0.3 0.4)) even though there are six time points, because I meant to skip the third time point (time = 2). If I list all six time points and use vardist("time") = ordinal((0 1 2 3 4 5):(0 0.1 0 0.2 0.3 0.4)), does it mean the same thing as vardist("time") = ordinal((0 1 3 4 5):(0 0.1 0.2 0.3 0.4))? How would the interpretation differ if I used vardist("time") = ordinal((0 1 3 4 5):(0 0.2 0.1 0.3 0.4))?
3. I believe the current code only considers Treatment and Time in the power calculation. If I wanted to add more covariates, would I just need to include additional vardist statements? For example, to add a female indicator (gender effect) to the power analysis, I am thinking of the code below. Am I correct?
proc power;
logistic
vardist("Success") = BINOMIAL (0.5, 1)
vardist("Treatment") = binomial(0.5, 1)
vardist("Female")=binomial(0.5,1)
vardist("time") = ordinal((0 1 3 4 5):(0 0.1 0.2 0.3 0.4))
testpredictor = "Success"
covariates = "Treatment"|"time" "Female"
responseprob = 0.4 0.5 0.6 0.7
testoddsratio = 1.4
alpha = 0.05
power = 0.8
ntotal = .;
run;
Thanks a lot!
I'm going to provide a fairly generic answer to the "does this <version of option> mean the same as <different setting of option>" question: try it and see.
The distribution options (Ordinal, Beta, Binomial, Exponential, Gamma, etc.) can take any valid set of values. The values should reflect what is known or suspected about your data, for example from reading other research. Suppose three somewhat similar studies reported response rates of .35, .40 and .375 for a similar population (right-handed men with brown hair), and you are also going to survey right-handed men with brown hair; then a response rate in the .35 to .40 range would be a reasonable assumption. Likewise, if those studies showed that some characteristic had a binomial distribution with a p of .2 and you are using the same item, then you might use binomial(.2, n).
If you don't have such information then IF you use Binomial then a p of .5 tends to be the least sensitive and would tend to increase the required sample for a given power. Other parameters might be chosen for other suspected distributions.
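For the ordinal question, "try it and see" could look like the sketch below. It reuses the PROC POWER call from the question unchanged apart from the VARDIST for time (and the corrected responseprob of 0.7), and runs it twice so the two Computed N Total tables can be compared side by side. I have not run this exact code, so treat it as a starting point rather than a verified result.

```sas
/* Run A: all six time points listed, with time = 2 given probability 0 */
proc power;
   logistic
      vardist("Success")   = binomial(0.5, 1)
      vardist("Treatment") = binomial(0.5, 1)
      vardist("time")      = ordinal((0 1 2 3 4 5) : (0 0.1 0 0.2 0.3 0.4))
      testpredictor = "Success"          /* copied as-is from the question */
      covariates    = "Treatment" | "time"
      responseprob  = 0.4 0.5 0.6 0.7
      testoddsratio = 1.4
      alpha         = 0.05
      power         = 0.8
      ntotal        = .;
run;

/* Run B: probabilities for time = 1 and time = 3 swapped */
proc power;
   logistic
      vardist("Success")   = binomial(0.5, 1)
      vardist("Treatment") = binomial(0.5, 1)
      vardist("time")      = ordinal((0 1 3 4 5) : (0 0.2 0.1 0.3 0.4))
      testpredictor = "Success"          /* copied as-is from the question */
      covariates    = "Treatment" | "time"
      responseprob  = 0.4 0.5 0.6 0.7
      testoddsratio = 1.4
      alpha         = 0.05
      power         = 0.8
      ntotal        = .;
run;
```

If Run A reproduces the N Total values from the original five-point specification, the two ways of skipping time = 2 are equivalent for this calculation. Any difference between Run A and Run B shows how much the assumed spread of observations across time points matters.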