- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
After reading through the "Computational Methods and Formulas" for proc power within the User Guide as well as other posts within these communities about sample size calculation, I'm still having a hard time understanding why there is a discrepancy between the sample size calculated by SAS 9.4 and the one I obtain by hand.
I understand that onesamplefreq results in different values because the formula used by SAS accounts for type II error by including zpower whereas the formula typically used for hand calculations of this does not.
My problem is that I do not understand where the discrepancy when using onesamplemeans is introduced.
Example
If I'm trying to calculate sample size needed for a halfwidth of 50 (stddev=250) with 95% confidence by hand I do:
N = [z(1-alpha/2) * sigma / halfwidth]^2 = [1.96*250/50]^2 = 96.04 > 97
Using proc power I do:
proc power;
onesamplemeans ci = t
halfwidth = 50
stddev=250
probwidth= 0.95
ntotal=.
;
run
which gets 120
Deriving a formula for sample size from the half-width formula provided in the User Guide for onesamplemeans results in essentially the same equation I'm using for hand calculations. While the difference does not really matter to me in this instance, I would greatly appreciate any help in understanding how SAS is arriving at this number so that I know for studies I'm designing later this year.
Thanks,
aaron
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To verify power or sample size calculations I like to do a simulation. For your question this is fairly easy:
%let alpha=0.05; /* significance level */
%let s=250; /* standard deviation of normal variate */
%let h=50; /* target maximum half-width of CI for the mean */
%let n=120; /* sample size */
%let n_alt=97; /* alternative lower sample size */
%let nsim=1e5; /* number of samples */
/* Simulate &NSIM samples of &N normal random variates each */
data sim / view=sim;
call streaminit(27182818);
do i=1 to ≁
do k=1 to &n;
x=rand('normal',0,&s);
output;
end;
end;
run;
/* Compute sample means and sample standard deviations */
proc summary data=sim;
by i;
*where k<=&n_alt;
var x;
output out=stats(drop=_:) mean=m std=s;
run;
/* Compute relevant characteristics of CIs for the mean */
data ci;
set stats;
h=tinv(1-&alpha/2,&n-1)*s/sqrt(&n); /* half-width of CI */
valid=(-h<=m<=h); /* indicator "CI contains true mean" */
narrow=(h<=&h); /* indicator "CI is sufficiently narrow" */
run;
/* Check quality criteria for the CIs */
ods exclude BinomialTest;
proc freq data=ci;
tables valid narrow / bin(level='1');
tables valid*narrow;
run;
Results:
With the sample size n=120 suggested by PROC POWER a proportion of 0.9493 (95% CI 0.9479 to 0.9506) of the 100,000 simulated confidence intervals were valid, i.e., contained the true mean zero. Also, a proportion of 0.9520 (95% CI 0.9506 to 0.9533) of the confidence intervals had a half-width of at most 50. Good.
However, after reducing the sample size to your suggested n=97 (by activating the WHERE statement commented out above) the two proportions deteriorate to 0.9209 (95% CI 0.9192 to 0.9226) and 0.9343 (95% CI 0.9327 to 0.9358), respectively. So, there is strong evidence that n=97 is insufficient if the goal is 0.95.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @atbeczkiewicz and welcome to the SAS Support Communities!
I haven't checked your formula, but my "hand" (i.e. Base SAS) calculation of the power* obtained with sample sizes 97 (your suggestion), 119 and 120 (PROC POWER result) using the formula in the documentation (section Confidence Interval for Mean (CI=T), Pr(half-width<=h)=..., two-sided) would look something like this:
data chk;
h=50;
sigma=250;
alpha=0.05;
do n=97, 119, 120;
power=cdf('chisq',h**2*n*(n-1)/(sigma**2*tinv(1-alpha/2,n-1)**2),n-1);
output;
end;
run;
Result:
Obs h sigma alpha n power 1 50 250 0.05 97 0.47682 2 50 250 0.05 119 0.94304 3 50 250 0.05 120 0.95130
So, for a power of 0.95 you'd need n=120 -- exactly the result of PROC POWER.
Edit:
* (sloppy use of the term "power" in the sense "probability that half-width of CI does not exceed h")
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To verify power or sample size calculations I like to do a simulation. For your question this is fairly easy:
%let alpha=0.05; /* significance level */
%let s=250; /* standard deviation of normal variate */
%let h=50; /* target maximum half-width of CI for the mean */
%let n=120; /* sample size */
%let n_alt=97; /* alternative lower sample size */
%let nsim=1e5; /* number of samples */
/* Simulate &NSIM samples of &N normal random variates each */
data sim / view=sim;
call streaminit(27182818);
do i=1 to ≁
do k=1 to &n;
x=rand('normal',0,&s);
output;
end;
end;
run;
/* Compute sample means and sample standard deviations */
proc summary data=sim;
by i;
*where k<=&n_alt;
var x;
output out=stats(drop=_:) mean=m std=s;
run;
/* Compute relevant characteristics of CIs for the mean */
data ci;
set stats;
h=tinv(1-&alpha/2,&n-1)*s/sqrt(&n); /* half-width of CI */
valid=(-h<=m<=h); /* indicator "CI contains true mean" */
narrow=(h<=&h); /* indicator "CI is sufficiently narrow" */
run;
/* Check quality criteria for the CIs */
ods exclude BinomialTest;
proc freq data=ci;
tables valid narrow / bin(level='1');
tables valid*narrow;
run;
Results:
With the sample size n=120 suggested by PROC POWER a proportion of 0.9493 (95% CI 0.9479 to 0.9506) of the 100,000 simulated confidence intervals were valid, i.e., contained the true mean zero. Also, a proportion of 0.9520 (95% CI 0.9506 to 0.9533) of the confidence intervals had a half-width of at most 50. Good.
However, after reducing the sample size to your suggested n=97 (by activating the WHERE statement commented out above) the two proportions deteriorate to 0.9209 (95% CI 0.9192 to 0.9226) and 0.9343 (95% CI 0.9327 to 0.9358), respectively. So, there is strong evidence that n=97 is insufficient if the goal is 0.95.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks @FreelanceReinh!