Hi all,

Thank you for the quick response. I don't think I previously stated the hypothesis of interest: is there a significant difference between groups 2 and 3 combined versus group 1?

I realized that the same IDs were reused in every treatment group, so when I combined the two treatment groups the model assumed that certain subjects had multiple assessments at each time point (i.e., that there were only 10 subjects in the newly created treatment group and therefore only 20 subjects in total). I have updated my code to make the subject IDs unique. This experimental design assumes 10 subjects are randomized to each of 3 treatment groups (i.e., 30 subjects in total).

If I am interested in comparing two pooled groups versus one group, how does the interpretation of the following two models differ? The LSMD (least-squares mean difference) estimate is the same, but the SEs differ. My gut feeling is to use the ESTIMATE statement because it follows the experimental design, but is there another reason beyond that, or should I use the pooled treatment group variable (trt2) instead?

data newtest;
  call streaminit(33445);
  do id = 1 to 10;
    rid = rand('normal'); * random effect for subject=id;
    do trt = 1 to 3;
      if trt in (2,3) then trt2 = 2;
      else trt2 = trt;
      do time = 1 to 2;
        y = trt + trt*time + rand('normal') + rid;
        output;
      end;
    end;
  end;
run;
data newtest;
set newtest;
id = id + 10*(trt - 1); * unique subject IDs: trt 1 -> 1-10, trt 2 -> 11-20, trt 3 -> 21-30;
run;
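
Since the pooled model nests subjects within trt2 rather than trt, it may be worth confirming that the recoded IDs are unique within the pooled group as well as within each original group. A minimal sketch of such a check follows; n_subjects is only an illustrative column name, and the design described above implies 10 subjects per original group and 20 in the pooled group.

```sas
/* Sketch: count distinct subject IDs per original and per pooled group. */
proc sql;
  /* by original treatment group: the design implies 10 per group */
  select trt, count(distinct id) as n_subjects
    from newtest
    group by trt;

  /* by pooled treatment group: the design implies 10 and 20 */
  select trt2, count(distinct id) as n_subjects
    from newtest
    group by trt2;
quit;
```

If the pooled count comes out below 20, some ID values are shared between groups 2 and 3 and would be treated as one subject under subject=id(trt2).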
proc mixed data=newtest method=reml;
class id trt time;
model y = trt time trt*time/ s ddfm=kr covb;
repeated time/ type=un subject=id(trt);
lsmeans trt*time / diff;
estimate 'test1' trt 1 -0.5 -0.5
         trt*time 0  1
                  0 -0.5
                  0 -0.5 / e;
run;
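
The test1 coefficients encode the time-2 cell mean for group 1 minus the average of the time-2 cell means for groups 2 and 3. As a cross-check on those coefficients, the same contrast can be written directly on the trt*time least-squares means with an LSMESTIMATE statement. This is a sketch only: the label text is illustrative, and it assumes the default level ordering (trt 1 to 3, time 1 to 2 within each trt).

```sas
/* Sketch: the test1 contrast written on the trt*time LS-means. */
proc mixed data=newtest method=reml;
  class id trt time;
  model y = trt time trt*time / ddfm=kr;
  repeated time / type=un subject=id(trt);
  /* cell order: (trt1,t1) (trt1,t2) (trt2,t1) (trt2,t2) (trt3,t1) (trt3,t2) */
  lsmestimate trt*time 'trt 1 vs mean(trt 2, trt 3) at time 2'
              0 1  0 -0.5  0 -0.5 / e;
run;
```

The E option prints the coefficient vector, which can be compared with the one printed by the ESTIMATE statement to confirm that both describe the same linear combination.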
proc mixed data=newtest method=reml;
class id trt2 time;
model y = trt2 time trt2*time/ s ddfm=kr covb;
repeated time/ type=un subject=id(trt2);
lsmeans trt2*time / diff e;
run;

The results I get follow.

The first model (ESTIMATE statement):

Estimates
Label  Estimate  Standard Error    DF  t Value  Pr > |t|
test1   -4.7338          0.5992    27    -7.90    <.0001

The second model (pooled trt2 variable):

Differences of Least Squares Means
Effect     TRT2  TIME  _TRT2  _TIME  Estimate  Standard Error    DF  t Value  Pr > |t|
TRT2*TIME     1     1      1      2   -1.0561          0.4529    28    -2.33    0.0271
TRT2*TIME     1     1      2      1   -3.7173          0.7552    28    -4.92    <.0001
TRT2*TIME     1     1      2      2   -5.7899          0.7797    35    -7.43    <.0001
TRT2*TIME     1     2      2      1   -2.6612          0.8034    34    -3.31    0.0022
TRT2*TIME     1     2      2      2   -4.7338          0.8265    28    -5.73    <.0001
TRT2*TIME     2     1      2      2   -2.0725          0.3203    28    -6.47    <.0001

Another part of my question concerns pairwise comparisons as an exploratory analysis. Would you use contrast (ESTIMATE) statements to obtain those LSMDs, or would you fit the model using only the subjects in the treatment groups of interest? Here again, one gets the same LSMD estimate, but the SE and DF are different.

proc mixed data=newtest method=reml;
class id trt time;
model y = trt time trt*time/ s ddfm=kr ;
repeated time/ type=un subject=id(trt);
lsmeans trt*time / diff;
estimate 'test2' trt 0 1 -1
         trt*time 0  0
                  0  1
                  0 -1 / e;
run;
proc mixed data=newtest method=reml;
where trt in (2 3);
class id trt time;
model y = trt time trt*time/ s ddfm=kr ;
repeated time/ type=un subject=id(trt);
lsmeans trt*time / diff;
run;

The output from the ESTIMATE statement (model 1):

Estimates
Label  Estimate  Standard Error    DF  t Value  Pr > |t|
test2   -3.5463          0.6919    27    -5.13    <.0001

The output from the subset model (model 2):

Differences of Least Squares Means
Effect     TRT  TIME  _TRT  _TIME  Estimate  Standard Error    DF  t Value  Pr > |t|
TRT*TIME     2     1     2      2   -1.5477          0.3855    18    -4.02    0.0008
TRT*TIME     2     1     3      1   -2.4967          0.6730    18    -3.71    0.0016
TRT*TIME     2     1     3      2   -5.0940          0.7234  23.5    -7.04    <.0001
TRT*TIME     2     2     3      1   -0.9489          0.7234  23.5    -1.31    0.2023
TRT*TIME     2     2     3      2   -3.5463          0.7705    18    -4.60    0.0002
TRT*TIME     3     1     3      2   -2.5973          0.3855    18    -6.74    <.0001

I greatly appreciate everyone's insight.