Statistical Procedures

Bim · Posted 12-04-2018 09:24 AM

I need to calculate the power of equivalence test for paired data. I use the following code:

/* ************************************************************************ */

/* This program is used to calculate the power of an equivalence test */

/* between two groups of paired data based the following data: */

/* mean difference and standard deviation */

/* of the differences between each pair; */

/* coefficient of correlation between the two groups; */

/* sample size (number of pairs); */

/* significance level, and */

/* acceptance interval. */

/* proc POWER is used for the calculation */

*/ */

*/************************************************************************* */

proc power;

pairedmeans

TEST=EQUIV_DIFF

meandiff=1

stddev=2

power = . /*to be calculated*/

npairs =3

corr = 0.9 0.1

alpha =0.05

upper = 7

lower =-7;

ods output output = pow;

run;

In the SAS documentation the paired equivalence test is not described very clear.

My questions are:

In my code I use the standard deviation for differences (first, a difference is calculated for each of 3 pairs, and then the standard deviation is calculated from 3 differences). Is it correct to use the standard deviation for differences or should I use the common standard deviation per group?

How the correlation coefficient is used in the calculation of power?
What is a formula for the power calculated in proc power?

FreelanceReinh · Posted 12-05-2018 03:27 PM

@Bim wrote:

I have one more question.

If the correlation is zero in the proc power for paired equivalents test, should I expect that the calculated power of the paired equivalents test is the same as the power for not paired equivalents test?

When I run the following two codes I get different results. The power for the paired test is 0.710 while the power for the not paired equivalents test is 0.899.

Why there is a difference, if the standard deviations, mean difference and sample size are the same in both cases?

The reason for the difference is that different tests are performed: The equivalence test is based on the paired t-test in the former case and on the two-sample t-test in the latter case. The paired t-test uses a different estimator for the standard error than the two-sample t-test (see the formulas for SE in sections "One-Sample Design" and "Two-Independent-Sample Design" of the PROC TTEST documentation). Note that in the tests the standard deviations are estimated from the samples. Our specification sigma=2 is not used there.

As a consequence, the t-statistics have different denominators (while the numerators are the same). Essentially, the different terms are

the corrected sum of squares of the paired differences (i.e. n squares) in the case of the paired t-test
the corrected sum of squares of the samples in group 1 plus the corrected sum of squares of the samples in group 2 (i.e. 2n squares) in the case of the two-sample t-test.

With our assumption of zero correlation these two may even have the same expected value, but the crucial point is that the degrees of freedom are different: n-1 vs. 2(n-1). In your example with the small n=3 the difference between the corresponding t-distributions (i.e. with df=2 vs. 4) is particularly large.

It's not difficult to verify the power calculation with a simulation (using the two different tests):

%let n=3;
%let mu_diff=1;
%let stddev=2;
%let theta_L=-7;
%let theta_U=7;
%let alpha=0.05;
%let nsim=1e6;

proc format;
value signif
low-&alpha   = 'significant'
&alpha<-high = 'not significant';
run;

data pairs;
call streaminit(27182818);
do i=1 to &nsim;
  do id=1 to &n;
    x1=rand('normal',0,&stddev);
    x2=rand('normal',&mu_diff,&stddev);
    output;
  end;
end;
run;

proc transpose data=pairs out=twosamp(drop=id rename=(col1=x)) name=grp;
by i id;
var x1 x2;
run;

ods graphics off;
ods exclude all;
ods noresults;

proc ttest data=pairs tost(&theta_L, &theta_U);
by i;
paired x1*x2;
ods output EquivTests=eqtst1(where=(test='Overall'));
run;

proc ttest data=twosamp tost(&theta_L, &theta_U);
by i;
class grp;
ods output EquivTests=eqtst2(where=(variances='Equal' & test='Overall'));
run;

ods exclude none;
ods results;

ods exclude BinomialTest;
proc freq data=eqtst1;
format probt signif.;
tables probt / binomial;
run;

ods exclude BinomialTest;
proc freq data=eqtst2;
format probt signif.;
tables probt / binomial;
run;

Result of the first PROC FREQ step (paired-sample design):

                               P-Value

                                            Cumulative    Cumulative
          Probt    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
significant          709895       70.99        709895        70.99
not significant      290105       29.01       1000000       100.00


      Binomial Proportion
      Probt = significant

Proportion                0.7099
ASE                       0.0005
95% Lower Conf Limit      0.7090
95% Upper Conf Limit      0.7108

Exact Conf Limits
95% Lower Conf Limit      0.7090
95% Upper Conf Limit      0.7108

Sample Size = 1000000

Result of the second PROC FREQ step (two-independent-sample design):

                               P-Value

                                            Cumulative    Cumulative
          Probt    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
significant          898786       89.88        898786        89.88
not significant      101214       10.12       1000000       100.00


      Binomial Proportion
      Probt = significant

Proportion                0.8988
ASE                       0.0003
95% Lower Conf Limit      0.8982
95% Upper Conf Limit      0.8994

Exact Conf Limits
95% Lower Conf Limit      0.8982
95% Upper Conf Limit      0.8994

Sample Size = 1000000

Note that I used PROC TRANSPOSE to create the independent samples from the paired samples so that identical data are used for the two tests. The first components of the three pairs comprise group 1 (grp='x1') and the second components of the three pairs comprise group 2 (grp='x2') in the two-independent-sample design.

View solution in original post

FreelanceReinh · Posted 12-04-2018 01:43 PM

Hello @Bim and welcome to the SAS Support Communities!

Thanks for asking these interesting questions.

@Bim wrote:

In my code I use the standard deviation for differences (first, a difference is calculated for each of 3 pairs, and then the standard deviation is calculated from 3 differences). Is it correct to use the standard deviation for differences or should I use the common standard deviation per group?

The documentation says that STDDEV "specifies the standard deviation assumed to be common to both members of a pair." So, in general it is not correct to enter the standard deviation of the differences here*. You can either specify a common standard deviation (STDDEV= option) or the individual standard deviation of each member of the pair (PAIREDSTDDEVS= option, alias: PSTDS=) if the two sigmas are different. So, a specification pstds=(2,2) would be equivalent to stddev=2. (I verified this with your example.)

_____________________

* Edit: A second look at the formulas (see answers to the other questions) reveals: Both the individual standard deviations and the correlation coefficient are used in the power calculation only through the standard deviation of the differences. So, you can enter that value in the STDDEV= option if you specify CORR=0.5 because then sigma_diff=sqrt(2*sigma**2-2*0.5*sigma**2)=sigma.

_____________________

@Bim wrote:

How the correlation coefficient is used in the calculation of power?

It is used in the formula for the standard deviation of the differences (which is calculated from the two individual standard deviations and the correlation coefficient). See the second link in my answer to your next question.

@Bim wrote:
2. What is a formula for the power calculated in proc power?

You can find the formula in the PROC POWER documentation --> Details --> Computational Methods and Formulas
--> Analyses in the PAIREDMEANS Statement, section "Additive Equivalence Test for Mean Difference with Normal Data (TEST=EQUIV_DIFF)."

Just for fun, let's check if we can replicate the result from PROC POWER by implementing the formula using only Base SAS. Let's assume, just as an example, that the common standard deviation was 2 (as specified in your code).

proc print data=pow;
id corr;
var power;
format power 12.10;
run;

Result:

Corr           Power

 0.9    0.9999975766
 0.1    0.7509335978

Now our own computation:

data param;
N=3;
alpha=0.05;
mu_diff=1;
rho=0.9; /* rho=0.1; */
sigma1=2;
sigma2=2;
sigma_diff=sqrt(sigma1**2+sigma2**2-2*rho*sigma1*sigma2);
theta_U=7;
theta_L=-7;
arg1=tinv(1-alpha,N-1);
arg2a=(mu_diff-theta_U)/(sigma_diff/sqrt(N));
arg2b=(mu_diff-theta_L)/(sigma_diff/sqrt(N));
arg3=sqrt(N-1)*(theta_U-theta_L)/(2*sigma_diff/sqrt(N)*arg1);
f=sqrt(2*constant('PI'))/(gamma((N-1)/2)*2**((N-3)/2));
call symputx('N',N);
call symputx('t',arg1);
call symputx('delta1',arg2a);
call symputx('delta2',arg2b);
call symputx('b',arg3);
run;

%put &=N;
%put &=t;
%put &=delta1;
%put &=delta2;
%put &=b;

%macro func(x);
y=cdf('normal',-&t*&x/sqrt(&N-1)-&delta1)*&x**(&N-2)*pdf('normal',&x);
%mend func;

%AdpSimps(0,&b,1E-12);
%let q1=&integral;

%macro func(x);
y=cdf('normal',&t*&x/sqrt(&N-1)-&delta2)*&x**(&N-2)*pdf('normal',&x);
%mend func;

%AdpSimps(0,&b,1E-12);
%let q2=&integral;

%put &=q1;
%put &=q2;

data check;
set param;
power=f*(&q1-&q2);
run;

proc print data=check noobs;
var power;
format power 12.10;
run;

(The code of macro %AdpSimps can be found in this SESUG 2007 paper.)

Result:

       power

0.9999975765

And with rho=0.1:

       power

0.7509335977

So, the results from the "manual" calculation differ by only 1E-10 from those obtained with PROC POWER, suggesting that PROC POWER indeed uses this formula.

Bim · Posted 12-05-2018 07:23 AM

Thank you very much for your thorough reply. It has helped a lot.

I have one more question.

If the correlation is zero in the proc power for paired equivalents test, should I expect that the calculated power of the paired equivalents test is the same as the power for not paired equivalents test?

When I run the following two codes I get different results. The power for the paired test is 0.710 while the power for the not paired equivalents test is 0.899.

Why there is a difference, if the standard deviations, mean difference and sample size are the same in both cases?

proc power;
pairedmeans
TEST=EQUIV_DIFF
meandiff=1
pstds=(2,2)
power = . /* to be calculated*/
npairs =3
corr = 0
alpha =0.05
upper = 7
lower =-7;
ods output output = power_paired;
run;

proc power;
twosamplemeans
TEST=EQUIV_DIFF
meandiff=1
stddev=2
power = . /* to be calculated*/
ntotal =6
alpha =0.05
upper = 7
lower =-7;
ods output output = power_twosample;

run;

FreelanceReinh · Posted 12-05-2018 03:27 PM

@Bim wrote:

I have one more question.

If the correlation is zero in the proc power for paired equivalents test, should I expect that the calculated power of the paired equivalents test is the same as the power for not paired equivalents test?

When I run the following two codes I get different results. The power for the paired test is 0.710 while the power for the not paired equivalents test is 0.899.

Why there is a difference, if the standard deviations, mean difference and sample size are the same in both cases?

The reason for the difference is that different tests are performed: The equivalence test is based on the paired t-test in the former case and on the two-sample t-test in the latter case. The paired t-test uses a different estimator for the standard error than the two-sample t-test (see the formulas for SE in sections "One-Sample Design" and "Two-Independent-Sample Design" of the PROC TTEST documentation). Note that in the tests the standard deviations are estimated from the samples. Our specification sigma=2 is not used there.

As a consequence, the t-statistics have different denominators (while the numerators are the same). Essentially, the different terms are

the corrected sum of squares of the paired differences (i.e. n squares) in the case of the paired t-test
the corrected sum of squares of the samples in group 1 plus the corrected sum of squares of the samples in group 2 (i.e. 2n squares) in the case of the two-sample t-test.

With our assumption of zero correlation these two may even have the same expected value, but the crucial point is that the degrees of freedom are different: n-1 vs. 2(n-1). In your example with the small n=3 the difference between the corresponding t-distributions (i.e. with df=2 vs. 4) is particularly large.

It's not difficult to verify the power calculation with a simulation (using the two different tests):

%let n=3;
%let mu_diff=1;
%let stddev=2;
%let theta_L=-7;
%let theta_U=7;
%let alpha=0.05;
%let nsim=1e6;

proc format;
value signif
low-&alpha   = 'significant'
&alpha<-high = 'not significant';
run;

data pairs;
call streaminit(27182818);
do i=1 to &nsim;
  do id=1 to &n;
    x1=rand('normal',0,&stddev);
    x2=rand('normal',&mu_diff,&stddev);
    output;
  end;
end;
run;

proc transpose data=pairs out=twosamp(drop=id rename=(col1=x)) name=grp;
by i id;
var x1 x2;
run;

ods graphics off;
ods exclude all;
ods noresults;

proc ttest data=pairs tost(&theta_L, &theta_U);
by i;
paired x1*x2;
ods output EquivTests=eqtst1(where=(test='Overall'));
run;

proc ttest data=twosamp tost(&theta_L, &theta_U);
by i;
class grp;
ods output EquivTests=eqtst2(where=(variances='Equal' & test='Overall'));
run;

ods exclude none;
ods results;

ods exclude BinomialTest;
proc freq data=eqtst1;
format probt signif.;
tables probt / binomial;
run;

ods exclude BinomialTest;
proc freq data=eqtst2;
format probt signif.;
tables probt / binomial;
run;

Result of the first PROC FREQ step (paired-sample design):

                               P-Value

                                            Cumulative    Cumulative
          Probt    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
significant          709895       70.99        709895        70.99
not significant      290105       29.01       1000000       100.00


      Binomial Proportion
      Probt = significant

Proportion                0.7099
ASE                       0.0005
95% Lower Conf Limit      0.7090
95% Upper Conf Limit      0.7108

Exact Conf Limits
95% Lower Conf Limit      0.7090
95% Upper Conf Limit      0.7108

Sample Size = 1000000

Result of the second PROC FREQ step (two-independent-sample design):

                               P-Value

                                            Cumulative    Cumulative
          Probt    Frequency     Percent     Frequency      Percent
--------------------------------------------------------------------
significant          898786       89.88        898786        89.88
not significant      101214       10.12       1000000       100.00


      Binomial Proportion
      Probt = significant

Proportion                0.8988
ASE                       0.0003
95% Lower Conf Limit      0.8982
95% Upper Conf Limit      0.8994

Exact Conf Limits
95% Lower Conf Limit      0.8982
95% Upper Conf Limit      0.8994

Sample Size = 1000000

Note that I used PROC TRANSPOSE to create the independent samples from the paired samples so that identical data are used for the two tests. The first components of the three pairs comprise group 1 (grp='x1') and the second components of the three pairs comprise group 2 (grp='x2') in the two-independent-sample design.

Bim · Posted 12-10-2018 02:26 AM

Thank you! The replies were very helpful and have solved my problem.

Statistical Procedures

proc power

Re: proc power

Re: proc power

Re: proc power

Re: proc power

Re: proc power

Implementation power solar system

NMG Consulting - AI Powered Text Mining Strategy

Why Is The Power of My Mixture Experiment So Low?

Leverage the Power of SAS® With SingleStore

PROC POWER - Twosamplefreq NULLPROPORTION

Follow Us

What is...

Statistical Procedures

Our biggest data and AI event of the year.

Follow Us

What is...