The Littell et al. text is not a one-day project 🙂 I'm looking forward to the next version.
Yes, I saw your message and then when I looked again, it was gone. Odd.
GLIMMIX offers all the same covariance structures as does MIXED, except for the Kronecker products. And GLIMMIX has a few more than MIXED. GLIMMIX also provides the option of sandwich estimators ("empirical"), including modifications that perform better with small samples that are not available in MIXED. In my opinion, with few exceptions, GLIMMIX is a better tool than MIXED; I rarely use MIXED anymore.
The empirical options does not work at all well for analysis of this data set. According to an old paper by Walt Stroup (and other sources):
"This approach assumes that the number of subjects per treatment is substantially greater than the number of times of observation. When the number of observation times is equal to or greater than the number of subjects per treatment, as often happens in agricultural experiments, the empirical estimate of Var(K'b) may actually be less than the model-based estimate and the resulting test-statistics may be wildly inflated."
Using ddfm=kr2 works much better. Try both and see. (In GLIMMIX, empirical=mbn works well, too, but not the classic empirical.)
You need to specify the existence of 48 subjects rather than 24, which you can do by specifying subject=subjectid(group) rather than subject=subjectid
If the response mean could theoretically change with levels of Trial1, then Trial1 needs to be included as a fixed effect in the MODEL statement.
Here is what I would consider as a first pass, not necessarily final:
/* Import data from CSV */
PROC IMPORT OUT= WORK.dissertation
DATAFILE= "Dissertation.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
/* Delete unfilled columns and unfilled rows */
data dissertation;
set dissertation;
drop var20-var56;
where whole_1st= 1;
run;
/* Variable transformation */
data dissertation;
set dissertation;
gmp_syl_log = log(gmp_syl);
gmp_syl_sqrt = sqrt(gmp_syl);
run;
title1 "Coding check";
proc freq data=dissertation;
table group subjectid slc block trial1;
run;
title1 "Pattern of observations";
proc tabulate data=dissertation(where=(gmp_syl ne .));
class group subjectid slc trial1 block;
table group*subjectid, slc*trial1 / misstext="X";
run;
proc sort data=dissertation;
by group subjectid slc trial1;
run;
title1 "Observed data";
proc sgpanel data=dissertation noautolegend;
panelby slc group / columns=2;
series x=trial1 y=gmp_syl / group=subjectid markers;
run;
title1 "Observed data, log scale";
proc sgpanel data=dissertation noautolegend;
panelby slc group / columns=2;
series x=trial1 y=gmp_syl_log / group=subjectid markers;
run;
proc sgpanel data=dissertation;
panelby group subjectid / columns=6;
series x=trial1 y=gmp_syl_log / group=slc markers;
run;
title1 "AR(1) among TRIAL1 levels using MIXED";
proc mixed data=dissertation covtest ;*plots=all;
class group subjectid slc trial1;
model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
random intercept / subject=subjectid(group);
repeated trial1 / subject=slc*subjectid(group) type=ar(1);
lsmeans trial1 / diff adjust=simulate(seed=98375);
lsmeans group*slc ;
run;
title1 "AR(1) among TRIAL1 levels using GLIMMIX";
proc glimmix data=dissertation plots=(studentpanel boxplot(fixed student));
class group subjectid slc trial1;
model gmp_syl_log = group|slc|trial1 / ddfm=kr2;
random intercept / subject=subjectid(group);
random trial1 / subject=slc*subjectid(group) type=ar(1) residual;
lsmeans trial1 / diff adjust=simulate(seed=98375) lines;
lsmeans group*slc / plot=meanplot(sliceby=group join cl);
lsmestimate group*slc "Group effect: SLC 1 v 2" 1 -1 0 -1 1 0,
"Group effect: SLC 1 v 3" 1 0 -1 -1 0 1,
"Group effect: SLC 2 v 3" 0 1 -1 0-1 1
/ adjust=simulate(seed=29847);
run;
This model has three hierarchical design levels: Subject (assigned randomly to Group), Session within Subject (assigned--not randomly--to SLC), and RepeatedMeasure nested within Session within Subject (assigned--not randomly--to Trial1).
gmp_syl is strongly skewed, but also appears to have an upper bound that plays a bit of havoc with the distribution. I do not know how this variable was obtained, but this upper bound is something to consider.
On the log-transformed scale, there is no substantial evidence of heteroscedasticity.
AR(1) was “best” relative to CS, ARH(1), AR(1)+RE, AR(1) by SLC, TOEP based on AICc.
I show code for both MIXED and GLIMMIX (which has some nice features that are not available in MIXED).
Feel free to ask questions after you have studied the syntax, consulted the documentation and the Littell et al. text, and such.
Have fun!
... View more