Solved: Re: Permutation test - repeated measures ANOVA

jnvickery · Posted 05-20-2014 06:12 PM

Hi everyone,

I am new to IML and permutation tests. I'm working on a permutation test for repeated measures ANOVA. Below is code to

create a sample dataset and IML code to run the permutation test.

My questions...

Am I totally off base with this?

Could I better vectorize the computations?

Any comments on the appropriateness of this method?

Any comments on whether or not it could be used with count data?

Many thanks,

John

/* create test dataset */

data test;

input t1-t3;

datalines;

45 50 55

42 42 45

36 41 43

39 35 40

51 55 59

44 49 56

;;

run;

proc iml;

use test;

read all into xobt;

close test;

/* module to calculate F ratio */

start fmod;

gmean = x[:]; /* grand mean */

n = nrow(x); /* number of rows/subjects */

k = ncol(x); /* number of columns/treatments */

tmeans = x[:,]; /* treatment means */

submeans = x[,:]; /* subject means */

/* SSb */

SSb = (tmeans-gmean)##2;

SSb = n*(SSb[+]);

/* SSw */

SSw = (tmeans-x)##2;

SSw = SSw[+,];

SSw = SSw[+];

/* SSsub */

SSsub = k*((submeans-gmean)##2);

SSsub = SSsub[+];

SSerror = SSw-SSsub; /* SSerror */

MSt = SSb/(k-1); /* MSt */

MSerror = SSerror/((n-1)*(k-1)); /* MSerror */

F = MSt/MSerror; /* F statistic */

finish;

/* calculate F for observed data */

x = xobt;

run fmod;

fobt = F;

print fobt;

/* permutation test */

call randseed(1234);

x = j(nrow(xobt),ncol(xobt)); /* allocate matrix for permuted data */

B = 1000; /* number or reps */

fdist = j(B,1); /* vector to hold bootstrap dist of F */

do j = 1 to B; /* bootstrap loop */

/* permute each row */

do i = 1 to n;

p = xobt;

q = ranperm(p[i,]);

x[i,] = q;

end;

run fmod; /* calculate F on permuted data */

fdist[j,] = F; /* store calculated F in fdist vector */

end;

/* compute p-value */

pval = sum(fdist > abs(fobt)) / B;

print pval;

/* create dataset */

create bootf var {fdist};

append;

close bootf;

quit;

Rick_SAS · Posted 05-21-2014 10:45 AM

First, welcome to the world of SAS/IML programming! You might be interested in these tips for beginners. I write a blog about statistical programming in SAS, and many of my posts are "Getting Started Examples."

Second, you seem to be an excellent programmer. Your code is pretty good already, and you've clearly researched the problem and figured out the basics of the IML language. I think you'll be an expert who answers OTHER people's questions before long!

Here are some hints and suggestions that I think would improve the program:

1) Turn the FMOD module into a function with a local symbol table. For example,

start fmod(x);

...

return(F);

finish;

2) Vectorizing the inner loop (do i=1 to n) is tricky because there isn't a built-in SAS/IML function for "permute each row".

If your data sets are small, I would leave your code the way it is. Only if you are dealing with many rows (large n) will you want to improve the efficiency. If you are dealing with large n, or just want to learn more, read on. Otherwise, you're done.

The RANPERM function supports a second argument that specifies the number of permutations to return. I would use this feature to vectorize the inner loop. Instead of permuting the ELEMENTS of p, permute the column indices {1 2 3}. You then need to convert the COLUMN indices into MATRIX indices by adding a constant that depends on the row. (See the article "Converting matrix subscripts to indices" for terminology and the computations.) I wrapped these ideas into the function PermuteWithinRows that you can call from within the bootstrap loop. I've attached output to illustrate the process.

proc iml;

use test; read all into xobt; close test;

/* independently permute elements of each row of a matrix */
start PermuteWithinRows(m);
   colIdx = ranperm(1:ncol(m), nrow(m));
   /* generate matrix indices */
   f = (row(m) - 1) * ncol(m); /* ROW fcn in 12.3 */
   matIdx = f + colIdx;
   *print colIdx " + " f " = " matIdx;

return( shape(m[matIdx], nrow(m)) );
finish;

call randseed(1234);
x = PermuteWithinRows(xobt);
print xobt " ==> " x;

View solution in original post

Rick_SAS · Posted 05-21-2014 06:05 AM

If I get some time I'll look at this more closely, but for now look at pp 11-14 of http://support.sas.com/resources/papers/proceedings10/329-2010.pdf to see an example of a matched pair permutation test. Not exactly what you are doing here, but some of the ideas might be helpful.

jnvickery · Posted 05-21-2014 09:51 AM

Thanks, Rick! My apologies. I should have mentioned in the original post that your paper was my starting point. I used the ranperm function instead of the randgen as in the paper to permute the rows of the matrix.

Rick_SAS · Posted 05-21-2014 10:45 AM

First, welcome to the world of SAS/IML programming! You might be interested in these tips for beginners. I write a blog about statistical programming in SAS, and many of my posts are "Getting Started Examples."

Second, you seem to be an excellent programmer. Your code is pretty good already, and you've clearly researched the problem and figured out the basics of the IML language. I think you'll be an expert who answers OTHER people's questions before long!

Here are some hints and suggestions that I think would improve the program:

1) Turn the FMOD module into a function with a local symbol table. For example,

start fmod(x);

...

return(F);

finish;

2) Vectorizing the inner loop (do i=1 to n) is tricky because there isn't a built-in SAS/IML function for "permute each row".

If your data sets are small, I would leave your code the way it is. Only if you are dealing with many rows (large n) will you want to improve the efficiency. If you are dealing with large n, or just want to learn more, read on. Otherwise, you're done.

The RANPERM function supports a second argument that specifies the number of permutations to return. I would use this feature to vectorize the inner loop. Instead of permuting the ELEMENTS of p, permute the column indices {1 2 3}. You then need to convert the COLUMN indices into MATRIX indices by adding a constant that depends on the row. (See the article "Converting matrix subscripts to indices" for terminology and the computations.) I wrapped these ideas into the function PermuteWithinRows that you can call from within the bootstrap loop. I've attached output to illustrate the process.

proc iml;

use test; read all into xobt; close test;

/* independently permute elements of each row of a matrix */
start PermuteWithinRows(m);
   colIdx = ranperm(1:ncol(m), nrow(m));
   /* generate matrix indices */
   f = (row(m) - 1) * ncol(m); /* ROW fcn in 12.3 */
   matIdx = f + colIdx;
   *print colIdx " + " f " = " matIdx;

return( shape(m[matIdx], nrow(m)) );
finish;

call randseed(1234);
x = PermuteWithinRows(xobt);
print xobt " ==> " x;

jnvickery · Posted 05-21-2014 01:51 PM

Thanks so much! This is super helpful. Converting the module to a function and using matrix indices definitely improved the performance. The actual data to be used will likely have around 150 rows so I will apply your techniques.

Thanks also for your encouragement and the link to the "Getting Started Examples". Much appreciated!

SteveDenham · Posted 05-23-2014 10:14 AM

This is sweet--it addresses a nagging problem I have had with the permutations in a hierarchical design problem as well.

Steve Denham

Rick_SAS · Posted 06-02-2014 08:30 AM

Here's a blog post that I wrote about this problem:

Permute elements within each row of a matrix - The DO Loop

Ready to join fellow brilliant minds for the SAS Hackathon?