Hi everyone,
I have a mixed model experiment that involves some repeated measurements, and I'm trying to find the appropriate code. I have a split-plot experiment involving 54 rows of lettuce. I have 3 supplemental light treatments with 18 rows each. Within each light treatment I had 2 nitrogen treatments by 3 lettuce varieties, so 9 rows got one nitrogen treatment and the other 9 got the other nitrogen treatment. Each nitrogen treatment had 3 varieties of lettuce with 3 replications, so there were 3 rows with the same lettuce variety in each nitrogen treatment.
Every 7 days I took instantaneous gas exchange measurements to measure photosynthesis rate, transpiration rate, stomatal conductance, and chlorophyll content. I took 1 measurement on 1 plant per row each week for 5 weeks. I'm treating each row as an experimental unit.
I would like to analyze these measurements as repeated measurements to see how the light and nitrogen treatments affect them over time. I'm new to repeated measurements, so I've been reading SAS for Mixed Models by Littell, Milliken et al. (2006).
proc mixed data = cycle1.fallphysiology;
CLASS Rep Light Ncon Var DAT;
MODEL A = Light | Ncon | Var | DAT/DDFM=Satterthwaite;
random Rep(Light) Ncon*Var*Rep(Light);
Repeated DAT / subject=Ncon*Var*Rep(Light) type=vc r rcorr;
RUN;
Rep = replication
Ncon = nitrogen concentration
Var = variety
DAT = days after transplanting (measured every 7 days)
When I run this code, however, almost all of the effects are significant, which both my advisor and I find suspicious.
So I'm trying to figure out if this model is appropriate. I believe I have the right subject, since this should be the same row in each week. I've kept the covariance structure as independent (type=vc) because I measured a different plant each week. So from my understanding these are not technically repeated measurements, and there shouldn't be much correlation between weeks. However, I'd still like to treat them as repeated measurements if it makes sense to do so. I think I may also need a random statement. I've seen some examples use one and others not.
I've included one of my datasets along with the log and output of this code.
Any advice on if this model is appropriate for this experiment would be greatly appreciated.
Thanks!
I suspect that you ultimately want to make inferences about the effects of L and N at the overall population level, which suggests using a marginal, population-averaged model like a GEE model. That could be done with a model like the following, using your CHANNEL variable, which seems to be unique for each row. This treats the 3 rows in each L/N/V/week combo as independent. It includes DAT (week) as a continuous, linear effect in the model since this reflects time. TYPE=AR allows the correlations across the weeks to depend on the amount of time between measurements, but you could choose some other structure.
proc gee data = fallphysiology;
CLASS channel Light Ncon Var;
MODEL A = Light | Ncon | Var | DAT / type3;
Repeated subject=channel / type=ar corrw;
RUN;
Note that the estimated correlations are very small or even negative, suggesting that there might not be a need to allow for correlation among the repeated measurements. Indeed, allowing correlation might further reduce the standard errors, making some effects even more significant. The following fits the model without allowing correlations among repeated measurements. As you noted with your mixed model analysis, many effects in the model are still highly significant, including multiple interactions, presumably because the variability in the data is even smaller than the effect sizes. So, the code also plots the fitted model, showing the effect of each of L, N, and V over time within each combination of the other two variables, and provides tests of those effects.
proc genmod data = fallphysiology;
CLASS Light Ncon Var;
MODEL A = Light | Ncon | Var | DAT / type3;
effectplot slicefit(x=dat plotby=ncon*var);
slice Light*Ncon*Var / sliceby=ncon*var plots=none;
effectplot slicefit(x=dat plotby=light*var);
slice Light*Ncon*Var / sliceby=light*var plots=none;
effectplot slicefit(x=dat plotby=light*ncon);
slice Light*Ncon*Var / sliceby=light*ncon plots=none;
RUN;
Thanks for the reply! To clarify, channel is just the number of each row. It doesn't influence the result. I forgot to change the column header. Would this change the model? Why would the variable rep not be included?
It is the row that is repeatedly measured (5 times - once at each week/DAT), so the row identifier variable (you confirmed that that is what the CHANNEL variable is) is what should be in the SUBJECT= option to indicate the sets of observations that could be correlated. CHANNEL is not used as a predictor in the model. The sole purpose of the SUBJECT= effect is to identify which observations are correlated and which are independent when fitting the GEE model. REP appears to only indicate the 3 replicates within one combo of L/N/V/DAT with values 1,2,3 in any one such combo. It is not needed in the GEE model to differentiate correlated/uncorrelated observations and presumably would not be a predictor of interest if the REP=1 observations do not have some meaningful distinction from REP=2 observations. But as I noted, the correlation found by the GEE analysis among the observations on a row seems to be essentially nil, so it could be argued that neither a GEE nor a mixed model is needed.
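If you would rather check the "essentially nil" correlation inside PROC MIXED, one possible sketch is below. It is only an illustration under assumptions: it takes CHANNEL as the unique row identifier (as confirmed above), keeps DAT as a CLASS variable as in the original post (not the continuous DAT used in the GEE model), and leaves out the RANDOM statement from the original post, so it only speaks to the within-row correlation question. The two fits have the same fixed effects, so their REML fit statistics (AIC, BIC) can be compared.

```sas
/* Sketch only: compare AR(1) within-row correlation with independence. */
/* Assumes CHANNEL uniquely identifies each row (per the thread).       */

/* Fit 1: AR(1) correlation among the 5 weekly measurements on a row.   */
proc mixed data=fallphysiology method=reml;
   class channel Light Ncon Var DAT;
   model A = Light | Ncon | Var | DAT / ddfm=kr;
   repeated DAT / subject=channel type=ar(1) rcorr;
run;

/* Fit 2: independence (TYPE=VC), same fixed effects, for comparison.   */
proc mixed data=fallphysiology method=reml;
   class channel Light Ncon Var DAT;
   model A = Light | Ncon | Var | DAT / ddfm=kr;
   repeated DAT / subject=channel type=vc;
run;
```

If the AR(1) estimate is near zero and the fit statistics do not favor the first fit, that would be in line with what the GEE working correlation matrix showed.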