Solved: Re: proc power

karlbang · Posted 04-12-2018 12:27 PM

I am doing a sample size calculation for two proportions (alpha=0.05, beta=0.2, i.e. power=0.8). The group proportions are fixed at 0.05 and 0.0375, respectively. If I fix N=4555 subjects in one group, how many subjects do I need to include in the other group?

proc power;
	ods output Power.TwoSampleFreq.Output=out;
	twosamplefreq
	GROUPPROPORTIONS = (0.05 0.0375)
	groupns=(4555 .)
	power = 0.80
	alpha = .05;
run;

This code fails. Any suggestions?

JacobSimonsen · Posted 04-17-2018 08:08 AM

I suggest simulating with different group sizes and finding the N for which the probability of rejecting reaches 80%. I assume it is a two-sided test, so we can test the hypothesis of equal proportions with a likelihood ratio test.
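Written out, the test statistic looks like this, where y_k is the number of events among the n_k subjects in group k. The names l0, l1 and minus2logQ are the ones used in the data step; the chi-square reference distribution with 1 degree of freedom is the usual large-sample approximation.

```latex
\begin{aligned}
\ell_0 &= \sum_{k=1}^{2}\left[\, y_k \log\frac{y_k}{n_k} + (n_k - y_k)\log\frac{n_k - y_k}{n_k} \right],\\
\ell_1 &= (y_1+y_2)\log\frac{y_1+y_2}{n_1+n_2} + (n_1+n_2-y_1-y_2)\log\left(1-\frac{y_1+y_2}{n_1+n_2}\right),\\
-2\log Q &= -2\,(\ell_1-\ell_0) \;\sim\; \chi^2_1 \quad \text{approximately, under equal proportions.}
\end{aligned}
```

Here l0 is the log-likelihood with a separate proportion in each group and l1 is the log-likelihood with one common proportion, so minus2logQ is never negative.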

The calculation of the p-values is so simple here that it can be done within a data step. It turns out that about 3865 subjects should be in the other group in order to get a rejection probability of 80% (that is, the power).

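data simulation;
  /* group sizes: 4555 is fixed, 3865 is the trial value for the other group */
  array n{2} _temporary_ (4555,3865);
  /* true event proportions in the two groups */
  array p{2} _temporary_ (0.05,0.0375);
  array y_{2} _temporary_;
  do i=1 to 1000000;
    /* l0: log-likelihood with a separate proportion in each group */
    l0=0;
    do k=1 to 2;
      y=rand('binomial',p[k],n[k]); y_[k]=y;
      l0+y*log(y/n[k]);              /* events */
      y=n[k]-y; l0+y*log(y/n[k]);    /* non-events */
    end;
    /* l1: log-likelihood under the hypothesis of one common proportion */
    l1=(y_[1]+y_[2])*log((y_[1]+y_[2])/(n[1]+n[2]))+
       (n[1]+n[2]-y_[1]-y_[2])*log(1-(y_[1]+y_[2])/(n[1]+n[2]));
    minus2logQ=-2*(l1-l0);
    /* upper-tail chi-square probability with 1 degree of freedom */
    pvalue=sdf('chisquare',minus2logQ,1);
    reject=(pvalue<0.05);
    output;
  end;
  keep minus2logQ reject;
run;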
proc means data=simulation mean;
  var reject;
run;

(I edited this post: at first I said about 4000 in the other group. Increasing the number of simulations shows that 3865 is more accurate.)
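As a cross-check on the simulated figure, PROC POWER can report the power when both group sizes are given, even though it does not solve for one group size with the other fixed. The following is a minimal sketch, not a tested solution: the candidate sizes 3800 and 3900 are illustrative assumptions, and TEST=LRCHI is chosen only to match the likelihood ratio test used in the simulation.

```sas
/* Sketch: power of the likelihood ratio chi-square test for a few
   candidate sizes of the second group; the first group stays at 4555.
   The candidate sizes 3800 and 3900 are illustrative assumptions. */
proc power;
  twosamplefreq test=lrchi
    groupproportions = (0.05 0.0375)
    groupns = (4555 3800) (4555 3865) (4555 3900)
    alpha = 0.05
    sides = 2
    power = .;
run;
```

If the simulation is right, the row for 3865 should show a power close to 0.80; widening or narrowing the list of candidate sizes then brackets the required N.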

karlbang · Posted 04-17-2018 09:54 AM

More complicated than I hoped for, but thanks