☑ This topic is solved.
JoakimE
Obsidian | Level 7

Hi,

 

I am calculating power for a study with one active treatment and one control treatment affecting a dichotomous yes/no response variable. At baseline, the proportions in both groups are believed to be p1=p2=0.125. At the post-treatment measurement, the hypothesis is that the proportions will be p1=0.8 (active treatment) and p2=0.125 (control treatment). This will be tested using Fisher's exact test with a power of 80%. The randomization ratio of active to placebo should be 2:1. I use the following code for this:


proc power;
  twosamplefreq
    test=fisher
    alpha=0.05
    groupproportions=(0.8 0.125)
    groupweights=(2 1)
    power=0.8
    ntotal=.;
run;

 

which results in a total sample size of 24 (16 active and 8 placebo). However, this figure does not match what I get when I use the PASS software. There I get n1=13 and n2=6. The reason for the discrepancy is that PASS bases its calculation on exact enumeration of the binomial distributions of the response variable, not on Walters' normal approximation, which PROC POWER uses (when I select the normal approximation method in PASS, I also get a total sample size of 24 there).
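One way to see how PROC POWER rates the smaller PASS design is to fix the group sizes and solve for power instead of NTOTAL. This is a minimal sketch, assuming the GROUPNS= option replaces GROUPWEIGHTS= and NTOTAL=. The value it reports is still based on the approximation, not on the exact calculation:

```sas
/* Sketch: approximate power at the PASS group sizes (13 active, 6 placebo) */
proc power;
  twosamplefreq
    test=fisher
    alpha=0.05
    groupproportions=(0.8 0.125)
    groupns=(13 6)   /* fixed group sizes instead of weights and NTOTAL */
    power=.;         /* solve for power */
run;
```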

 

How can I get proc power to calculate the required sample size based on exact permutations for TWOSAMPLEFREQ? I tried METHOD=exact, but it doesn't work. Seems like a serious shortcoming in proc power if this is not possible, since the reason we want to use Fisher's exact test is due to a small sample size where the normal approximation is not advisable.

 

Many thanks for some input 🙂

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @JoakimE,

 

I don't think you can get the exact results with PROC POWER (in the current version), sadly. However, thanks to the small sample sizes, you can perform the necessary calculations with PROC FREQ and DATA steps, as I did for a similar request last month. Here's a quick adaptation of that code to your problem (please review):

%let nmin=4;     /* minimum sample size for placebo group */
%let nmax=8;     /* maximum sample size for placebo group */
%let alpha=0.05; /* significance level */
%let p1=0.8;     /* success probability in active group */
%let p2=0.125;   /* success probability in placebo group */

/* Create a dataset with all (2*n+e+1)*(n+1)-2 possible combinations of i=0, ..., 2*n+e successes (r=1) in the
   active group (g=1, n1=2*n+e) and j=0, ..., n successes in the placebo group (g=2, n2=n), except the two 
   extreme cases i=j=0 and i=2*n+e & j=n -- for all values of n between &nmin and &nmax and e=0, 1 */

data comb;
do n=&nmin to &nmax;
  do e=0 to 1;
    do i=0 to 2*n+e;
      do j=0 to n;
        if i=j=0 | i=2*n+e & j=n then continue;
        g=1;
        r=0; c=2*n+e-i; output;
        r=1; c=i;       output;
        g=2;
        r=0; c=n-j; output;
        r=1; c=j;   output;
      end;
    end;
  end;
end;
run;

/* Compute Fisher's exact test for each combination */

ods select none;
ods output FishersExact=fisher(where=(name1='XP2_FISH') keep=n e i j name1 nvalue1 rename=(nvalue1=p));
proc freq data=comb;
by n e i j;
tables g*r / chisq;
weight c;
run;
ods select all;

/* Compute power based on the joint distribution of two independent
   Bin(2*n+e,&p1) and Bin(n,&p2) distributed random variables */

data power(keep=n1 n2 power);
retain n1 n2 power;
set fisher;
by n e;
where .<p<=&alpha; /* The two extreme cases excluded above would not meet this condition anyway. */
power+pdf('binom',i,&p1,2*n+e)*pdf('binom',j,&p2,n);
if last.e then do;
  n1=2*n+e;
  n2=n;
  output;
  power=0;
end;
run;
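Written as a formula, the quantity accumulated in the last DATA step is the total probability of all tables that Fisher's test rejects. The notation is an assumption for illustration: p_F(i,j) stands for the two-sided Fisher p-value of the table with i and j successes, n_1 = 2n+e and n_2 = n.

```latex
\mathrm{power}(n_1, n_2)
  = \sum_{\substack{0 \le i \le n_1,\; 0 \le j \le n_2 \\ p_F(i,j) \le \alpha}}
    \binom{n_1}{i} p_1^{\,i} (1-p_1)^{n_1-i}
    \binom{n_2}{j} p_2^{\,j} (1-p_2)^{n_2-j}
```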

Results:

Obs    n1    n2     power

  1     8     4    0.35123
  2     9     4    0.47768
  3    10     5    0.64501
  4    11     5    0.72224
  5    12     6    0.78531
  6    13     6    0.80349
  7    14     7    0.81705
  8    15     7    0.79312
  9    16     8    0.90324
 10    17     8    0.89930

The results suggest that, indeed, n1=13 and n2=6 are correct. 
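Note that the exact power is not monotone in the sample size: in the table, n1=15/n2=7 falls back below 0.8 although n1=14/n2=7 exceeds it. Picking the design is therefore best done by filtering rather than by reading off the first row above the target. A minimal sketch, assuming the dataset POWER from the step above; the macro variable TARGET and the dataset CANDIDATES are hypothetical names:

```sas
%let target=0.8;  /* hypothetical macro variable: required power */

/* Keep only the designs that reach the target power */
data candidates;
  set power;
  where power >= &target;
  ntotal = n1 + n2;
run;

/* Order by total sample size so the smallest qualifying design comes first */
proc sort data=candidates;
  by ntotal;
run;

proc print data=candidates(obs=1) noobs;
  var n1 n2 ntotal power;
run;
```

With the results listed above, this selects n1=13 and n2=6 (19 subjects in total).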


Replies
PaigeMiller
Diamond | Level 26

If you check the documentation for PROC POWER (which should always be done before asking here), you would see that Fisher's exact test is indeed available for TWOSAMPLEFREQ.

 

The TWOSAMPLEFREQ statement performs power and sample size analyses for tests of two independent proportions. The Farrington-Manning score, Pearson’s chi-square, Fisher’s exact, and likelihood ratio chi-square tests are supported.

 

Use option TEST=FISHER

--
Paige Miller
JoakimE
Obsidian | Level 7

You should always read the question before answering here 😉 (pardon the salty tone)

 

The question was not whether or not Fisher's exact test is available with TWOSAMPLEFREQ, but how to avoid the normal approximation as the default power calculation method (see my original post).

JoakimE
Obsidian | Level 7

Thanks FreelanceReinhard, that was a neat solution! Something that works for small sample sizes at least. For larger sample sizes, I guess the normal approximation or a chi-square test would suffice anyway. It just seems a bit strange that SAS has not addressed this problem...

mariko5797
Pyrite | Level 9
Could you clarify what each letter stands for? I am trying to do something similar with differing sample sizes for active and placebo.

Infection Rates:
Active 0.0, 0.05, 0.10, 0.15, 0.20, 0.25
Placebo 1/N, 0.99

Sample Size (N):
Active 15 to 12
Placebo 7 to 10
FreelanceReinh
Jade | Level 19

Hello @mariko5797,

 

Always interesting to reread one's own code after a fairly long time ... :-)

 

  • g is the group identifier with values 1 for the active group and 2 for the placebo group.
  • n1 is the number of subjects in the active group.
  • n2=n is the number of subjects in the placebo group.
  • e=n1-2*n, which is either 0 or 1 because the assumption was a 2:1 randomization ratio of active vs. placebo. (The case e=1 actually relaxes the exact 2:1 ratio a bit so that n1 is not restricted to even numbers.)
  • r is the response variable with values 1 for success and 0 for failure.
  • c is the number of subjects for a particular combination of g, n, e and r in dataset COMB.
  • i is the number of successes in the active group.
  • j is the number of successes in the placebo group.

So, in your case you don't need variable e. Just combine all possible values of n1=12,...,15 and n2=7,...,10, i.e., 16 combinations. With varying infection rates (in both treatment groups) you'll have two additional ("outer") DO loops and similarly two additional BY variables.
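The adaptation described above might look roughly like the following sketch. It is a variation on the suggestion of extra outer DO loops: the Fisher p-values do not depend on the assumed rates, so here the rates are only joined in at the final power step. The dataset names COMB2, FISHER2, RATES and POWER2 are hypothetical, and the placebo rates in RATES are placeholders to be replaced with the intended values.

```sas
%let alpha=0.05;

/* All 2x2 tables for n1=12..15 (active) and n2=7..10 (placebo),
   except the two degenerate tables (no successes at all, no failures at all) */
data comb2;
  do n1=12 to 15;
    do n2=7 to 10;
      do i=0 to n1;
        do j=0 to n2;
          if (i=0 & j=0) | (i=n1 & j=n2) then continue;
          g=1; r=0; c=n1-i; output;
               r=1; c=i;    output;
          g=2; r=0; c=n2-j; output;
               r=1; c=j;    output;
        end;
      end;
    end;
  end;
run;

/* Two-sided Fisher p-value for every table */
ods select none;
ods output FishersExact=fisher2(where=(name1='XP2_FISH') keep=n1 n2 i j name1 nvalue1 rename=(nvalue1=p));
proc freq data=comb2;
  by n1 n2 i j;
  tables g*r / chisq;
  weight c;
run;
ods select all;

/* Assumed rate grid: active rates as posted, placebo rates are placeholders */
data rates;
  do p_active=0, 0.05, 0.10, 0.15, 0.20, 0.25;
    do p_placebo=0.5, 0.99;
      output;
    end;
  end;
run;

/* Power = total binomial probability of the tables rejected at level &alpha */
proc sql;
  create table power2 as
  select f.n1, f.n2, r.p_active, r.p_placebo,
         sum(pdf('binom', f.i, r.p_active, f.n1) *
             pdf('binom', f.j, r.p_placebo, f.n2)) as power
  from fisher2 as f, rates as r
  where f.p is not missing and f.p <= &alpha
  group by f.n1, f.n2, r.p_active, r.p_placebo
  order by f.n1, f.n2, r.p_active, r.p_placebo;
quit;
```

Because the join only touches the rejected tables, a combination of sample sizes with no rejected table at all would simply be absent from POWER2 rather than listed with power 0.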
