Statistical programming, matrix languages, and more

simulating multiple datasets for longitudinal analysis

Reply
Occasional Contributor
Posts: 6

simulating multiple datasets for longitudinal analysis

Dear ,

with the help of Rick Wicklin's book on simulating in SAS, I managed to simulate 1 dataset for a longitudinal analysis with three timepoints, 2 treatment groups and 5 subjects in each treatment group.

Now, I want to simulate 1000 datasets. I found something like "do SampleID = 1 to &NumSamples, then I could run the proc mixed with a by variable, but I don't succeed in completing this.

This is the code for 1 dataset. Any suggestions where I could find information?

data Interactions; WtGain = 0; /* the value of y does not matter */

do Week = 1 to 3;

  do Treatment = 1 to 2;

    do Indiv = 1 to 5;

    output;

    end;

  end;

end;

run;

proc print data=Interactions;

run;

proc sort data=Interactions;

by Treatment Indiv Week;

run;

proc print data=Interactions;

run;

proc glmmod data = Interactions noprint outparm = Parm outdesign = Design( drop = WtGain); /* DROP y */

class Week Treatment;

model WtGain = Week | Treatment;

run;

proc iml;

call randseed( 1);

use Design;

read all var _NUM_ into X;

close Design; /* Intcpt |--Week--| Treatment |-----Interactions-----| */

beta = {1, -1, -0.5, 0, -0.5, 0, -0.25, -0.25, 0, 0, 0, 0};

k = 3; /* 3 weeks thus 3 timepoints */

g=2; /*2 treatments*/

s1 = 5; /* 5 individuals in each group*/

s = g*s1; /* 10 individuals in  total*/

B = { 1    0.5  0.25 ,

      0.5  1    0.5  ,

      0.25 0.5  1 } ;

print B;

R = B; /* first block */

do i = 2 to s; /* create block-diagonal matrix */

R = block( R, b); /* with s blocks */

end;

print R;

Week = T(repeat(1:k, 1,s)); /* column: 1,2,3,1,2,3,… */

print Week ;

Indiv = colvec(repeat(T(1:s),1,k)); /* 1,1,1,2,2,2,3,3,3,… */

print Indiv ;

Treatment = colvec(repeat(T(1:g),1,s1*k));

print Treatment ;

call randseed( 1234);

create Mix var {"WtGain" "Week" "Treatment" "Indiv"}; /* name the variables */

/* random effects */

zero = j( 1, k*s, 0); /* the zero vector */

eps = RandNormal( 1, zero, R); /* eps ~ MVN( 0, R) */

print eps ;

fixed = X* beta ;

print fixed;

WtGain = fixed + T(eps) ; /* fixed effects, correlated errors */

append;

close;

quit;

* analyze the data;

proc mixed data = Mix CL;

class Indiv Week Treatment;

model WtGain = Week | Treatment/ s;

repeated / type = ar(1) subject = Indiv R;

*ods select Dimensions R CovParms;

run;size

SAS Super FREQ
Posts: 3,232

Re: simulating multiple datasets for longitudinal analysis

If I understand your question, this is Exercise 12.5 on p 236.  The answer to your question is in the middle of the page.

The complete solution of this and all exercises are included as part of the "Example Code and Data" link on the book's website

Occasional Contributor
Posts: 6

Re: simulating multiple datasets for longitudinal analysis

Thank you Rick, I never dared to expect such a quick and helpful answer! Your answer somehow crossed the alternative - long - solution that I found.

Occasional Contributor
Posts: 6

Re: simulating multiple datasets for longitudinal analysis

I found some solution, by looping and using the proc datasets with the append statement,  and using the submit statement , probably there are more elegant solutions.

data Ftests_base ;

format  Effect $14. NumDF 4. DenDF best4. Fvalue 7.2 ProbF pvalue6.4;

Effect = "." ;

NumDF = 0 ;

DenDF = 0 ;

Fvalue = 0 ;

ProbF = 0 ;

run;

proc iml;

call streaminit(1234);

do j=1 to 5 ;

  use Design;

  read all var _NUM_ into X;

  close Design; /* Intcpt |--Week--| Treatment |-----Interactions-----| */

  beta = {1, -1, -0.5, 0, -0.5, 0, -0.25, -0.25, 0, 0, 0, 0};

  k = 3; /* 3 weeks thus 3 timepoints */

  g=2; /*2 treatments*/

  s1 = 5; /* 5 individuals in each group*/

  s = g*s1; /* 10 individuals in  total*/

  B = { 1    0.5  0.25 ,

      0.5  1    0.5  ,

      0.25 0.5  1 } ;

  R = B; /* first block */

  do i = 2 to s; /* create block-diagonal matrix */

  R = block( R, b); /* with s blocks */

  end;

  Week = T(repeat(1:k, 1,s)); /* column: 1,2,3,1,2,3,… */

  Indiv = colvec(repeat(T(1:s),1,k)); /* 1,1,1,2,2,2,3,3,3,… */

  Treatment = colvec(repeat(T(1:g),1,s1*k));

  *call randseed( 1234);

  create Mix var {"WtGain" "Week" "Treatment" "Indiv"}; /* name the variables */

  /* random effects */

  zero = j( 1, k*s, 0); /* the zero vector */

  eps = RandNormal( 1, zero, R); /* eps ~ MVN( 0, R) */

  WtGain = X*beta + T(eps) ; /* fixed effects, correlated errors */

  append;

  close Mix;

  submit Mix ;

  proc mixed data = Mix CL;

  class Indiv  Treatment;

  model WtGain = Week | Treatment/ s;

  repeated / type = ar(1) subject = Indiv R;

  ods select Tests3;

  ods output Tests3=Ftests ;

  run;

  quit;

  proc datasets library=work nolist;

  append base=Ftests_base data=Ftests force;

  run;

  quit;

  endsubmit;

end;

quit;

SAS Super FREQ
Posts: 3,232

Re: simulating multiple datasets for longitudinal analysis

Regarding the efficiency of this SUBMIT/APPEND approach, you might want to read Section 6.4.4. on p 100, or the article "Simulation in SAS: The slow way or the BY way."

Post a Question
Discussion Stats
  • 4 replies
  • 298 views
  • 3 likes
  • 2 in conversation