Re: Seeking advice on repeated measurement code

Joshua42 · Posted 09-24-2025 11:34 PM

Hi everyone,

I have a mixed model experiment that involves some repeated measurements, and I'm trying to find the appropriate code. I have a split-plot experiment involving 54 rows of lettuce. I have 3 supplemental light treatments with 18 rows each. Within each light treatment I had 2 nitrogen treatments by 3 lettuce varieties, so 9 rows got one nitrogen treatment and the other 9 got the other nitrogen treatment. Each nitrogen treatment had 3 varieties of lettuce and I had 3 replications, so there were 3 rows with the same lettuce variety in each nitrogen treatment.

Every 7 days I took instantaneous gas exchange measurements to measure photosynthesis rate, transpiration rate, stomatal conductance, and chlorophyll content. I took 1 measurement on 1 plant per row each week for 5 weeks. I'm treating each row as an experimental unit.

I would like to analyze these measurements as repeated measurements to see how the light and nitrogen treatments affect them over time. I'm new to repeated measurements, so I've been reading SAS for Mixed Models by Littell, Milliken et al. (2006).

proc mixed data = cycle1.fallphysiology;
CLASS Rep Light Ncon Var DAT;
MODEL A = Light | Ncon | Var | DAT/DDFM=Satterthwaite;
random Rep(Light) Ncon*Var*Rep(Light); 
Repeated DAT / subject=Ncon*Var*Rep(Light) type=vc r rcorr;
RUN;

Rep = replication

Ncon = nitrogen concentration

Var = variety

DAT = days after transplanting or every 7 days

When I run this code, however, almost all of the effects are significant, which both my advisor and I find suspicious.

So I'm trying to figure out if this model is appropriate. I believe I have the right subject since this should be the same row in each week. I've kept the covariance structure as independent because I measured a different plant each week. So from my understanding these are not technically repeated measurements and there shouldn't be much correlation between weeks. However, I'd still like to treat them as repeated measurements if it makes sense to do so. I think I may also need a random statement. I've seen some examples use one and others not.

I've included one of my datasets along with the log and output of this code.

Any advice on if this model is appropriate for this experiment would be greatly appreciated.

Thanks!

StatDave · Posted 09-25-2025 04:33 PM

I suspect that you ultimately want to make inferences about the effects of L and N at the overall population level, which suggests using a marginal, population-averaged model like a GEE model. That could be done with a model like the following using your CHANNEL variable, which seems to be unique for each row. This treats the 3 rows in each L/N/V/week combo as independent. It includes DAT (week) as a continuous, linear effect in the model since it reflects time. TYPE=AR allows the correlations across the weeks to depend on the amount of time between measurements, but you could choose some other structure.

proc gee data = fallphysiology;
CLASS channel Light Ncon Var;
MODEL A = Light | Ncon | Var | DAT / type3;
Repeated subject=channel / type=ar corrw;
RUN;

Note that the estimated correlations are very small or even negative suggesting that there might not be a need to allow for correlation among the repeated measurements. Indeed, allowing correlation might further reduce the standard errors, making some effects even more significant. The following fits the model without allowing correlations among repeated measurements. As you noted with your mixed model analysis, many effects in the model are still highly significant, including multiple interactions, presumably because the variability in the data is even smaller than the effect sizes. So, the code also plots the fitted model showing the effect of each of L, N, and V over time within each combination of the other two variables, and provides tests of those effects.

proc genmod data = fallphysiology;
CLASS Light Ncon Var;
MODEL A = Light | Ncon | Var | DAT / type3;
effectplot slicefit(x=dat plotby=ncon*var);
slice Light*Ncon*Var / sliceby=ncon*var plots=none;
effectplot slicefit(x=dat plotby=light*var);
slice Light*Ncon*Var / sliceby=light*var plots=none;
effectplot slicefit(x=dat plotby=light*ncon);
slice Light*Ncon*Var / sliceby=light*ncon plots=none;
RUN;
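
As an illustration of the "some other structure" remark about the GEE step, here is a minimal sketch that swaps in an exchangeable working correlation. It assumes the same data set and variable names used above and that CHANNEL uniquely identifies each row. It is only one possible choice, not a recommendation.

```sas
/* Sketch only: same GEE model as above, exchangeable working correlation. */
/* Assumes CHANNEL uniquely identifies each row, as in the TYPE=AR fit.    */
proc gee data=fallphysiology;
  class channel Light Ncon Var;
  model A = Light | Ncon | Var | DAT / type3;
  repeated subject=channel / type=exch corrw;  /* CORRW prints the working correlation matrix */
run;
```

If the estimated exchangeable correlation is also near zero, that would be consistent with the small correlations seen under TYPE=AR.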

Joshua42 · Posted 09-25-2025 11:42 PM

Thanks for the reply! To clarify, channel is just the number of each row. It doesn't influence the result. I forgot to change the column header. Would this change the model? Why would the variable rep not be included?

StatDave · Posted 09-26-2025 12:36 PM

It is the row that is repeatedly measured (5 times - once at each week/DAT), so the row identifier variable (you confirmed that that is what the CHANNEL variable is) is what should be in the SUBJECT= option to indicate the sets of observations that could be correlated. CHANNEL is not used as a predictor in the model. The sole purpose of the SUBJECT= effect is to identify which observations are correlated and which are independent when fitting the GEE model. REP appears to only indicate the 3 replicates within one combo of L/N/V/DAT with values 1,2,3 in any one such combo. It is not needed in the GEE model to differentiate correlated/uncorrelated observations and presumably would not be a predictor of interest if the REP=1 observations do not have some meaningful distinction from REP=2 observations. But as I noted, the correlation found by the GEE analysis among the observations on a row seems to be essentially nil, so it could be argued that neither a GEE nor a mixed model is needed.
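
One quick way to confirm that CHANNEL behaves as a row identifier is to count the observations per CHANNEL value. This is a minimal sketch, assuming the data set and variable names from this thread. With 54 rows measured once a week for 5 weeks, one would expect 54 levels with 5 observations each.

```sas
/* Sketch only: count observations per row identifier.               */
/* Assumes CHANNEL is the row ID and each row was measured 5 times.  */
proc freq data=fallphysiology;
  tables channel / nocum nopercent;
run;
```

Any CHANNEL value with a frequency other than 5 would be worth checking before using it in SUBJECT=.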

Joshua42 · Posted 09-26-2025 06:48 PM

If the correlation is essentially nil and neither a GEE nor a mixed model is necessarily needed, could I just analyze each week individually? I've considered running an ANOVA by DAT (week). I tried it using PROC GLIMMIX. This is the code I used.

PROC GLIMMIX data= fallphysiology
PLOTS=PEARSONPANEL;
by DAT;
title "A";
CLASS Rep Light Ncon Var;
MODEL A = Light Ncon Var Light*Ncon Light*Var Ncon*Var Light*Ncon*Var/DDFM=Satterthwaite;
RANDOM Rep(Light);
lsmeans Light/pdiff lines adjust=Scheffe;
lsmeans Ncon/pdiff lines adjust=Scheffe;
lsmeans Var/pdiff lines adjust=Scheffe;
LSMEANS Light*Ncon/SLICEDIFF=(Ncon Light) lines ADJUST=SCHEFFE PLOTS=MEANPLOT(SLICEBY=Ncon JOIN);
LSMEANS Ncon*Var/SLICEDIFF=(Var Ncon) lines ADJUST=SCHEFFE PLOTS=MEANPLOT(SLICEBY=Var JOIN);
RUN;

If I don't necessarily need a mixed model what other model/code would you recommend?

StatDave · Posted 09-27-2025 03:41 PM

You could, of course, do separate analyses by week, but then you would not be able to properly assess the effect of time if that is of interest. I've already provided code that shows one way (using GENMOD) to fit a model ignoring correlation and provides tests and plots of the effects of the variables of interest. That code treats DAT as a continuous variable and assesses its linear effect which simplifies the model by eliminating the many additional parameters in the DAT main effect and interactions that are required when DAT is a categorical, CLASS variable. At least there is no 4-way interaction.
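
For comparison, this is roughly what the categorical-time version would look like. It is a sketch only, using the same data set and variable names as the earlier GENMOD step. With 5 weekly measurements, DAT as a CLASS variable takes 4 degrees of freedom instead of 1, and every interaction involving DAT grows by the same factor.

```sas
/* Sketch only: DAT treated as categorical (CLASS) rather than linear.  */
/* Same data set and variables as the earlier GENMOD step.              */
proc genmod data=fallphysiology;
  class Light Ncon Var DAT;
  model A = Light | Ncon | Var | DAT / type3;
run;
```

Comparing the Type 3 table from this fit with the one from the continuous-DAT model shows how many extra parameters the categorical version spends on time.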

Joshua42 · Posted 09-30-2025 12:59 PM

Thank you so much for your help so far. I used the genmod code on A in my data. I see it's giving chi-square test statistics and p-values. I do want to make inferences about the effects of L and N, but I'm trying to determine if those treatments have a significant effect on these measurements. I've used ANOVAs for other data like yield for this experiment, and the literature I've seen uses F statistics/ANOVA to present this type of data.

What other non-mixed model would you suggest to run an ANOVA?

If I just analyze the variance within each week can't I just drop the DAT variable from the model completely?

Is my glimmix code reasonable?

StatDave · Posted 09-30-2025 02:08 PM

F-tests and ANOVA are the result from fitting the model using ordinary least squares as can be done with PROC GLM or the more modern and capable PROC ORTHOREG. The GENMOD analysis is the same thing but done using maximum likelihood estimation (MIXED and GLIMMIX also use maximum likelihood estimation). So, these are basically the same analysis just using two different estimation methods. Assuming the response is normally distributed, they become equivalent as sample sizes increase. And, as I said, you could certainly do separate analyses by week if you so decide, but then you'll have five different assessments of the effects of L, N, and V to reconcile and you won't have an assessment of the effect of week.
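
A minimal sketch of the ordinary least squares version in PROC GLM, assuming the same data set and variable names as the GENMOD step. Like that step, it ignores any correlation among weeks and any split-plot error structure, so it only illustrates where the F tests come from.

```sas
/* Sketch only: ordinary least squares fit of the same fixed-effects model. */
/* DAT is left out of CLASS, so it enters as a continuous, linear effect.   */
proc glm data=fallphysiology;
  class Light Ncon Var;
  model A = Light | Ncon | Var | DAT / ss3;  /* SS3 requests Type III F tests */
run;
quit;
```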

Note that F tests can be produced for the Type3 in GENMOD by specifying either the DSCALE or PSCALE option along with the TYPE3 option in the MODEL statement. See "F statistics" in the Details section of the GENMOD documentation.
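
As a rough sketch, adding DSCALE to the earlier GENMOD MODEL statement would look like the following. It assumes the same data set and variables as before. As I understand it, DSCALE bases the scale estimate on the deviance divided by its degrees of freedom, which is what allows the Type 3 table to report F statistics.

```sas
/* Sketch only: earlier GENMOD model with DSCALE added so that the   */
/* Type 3 analysis reports F statistics alongside chi-square tests.  */
proc genmod data=fallphysiology;
  class Light Ncon Var;
  model A = Light | Ncon | Var | DAT / type3 dscale;
run;
```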

Joshua42 · Posted 09-30-2025 04:39 PM

What would be the difference between using GENMOD and using GLIMMIX? Is it just that GLIMMIX would be a mixed model and GENMOD would not? Would my GLIMMIX code be reasonable to analyze each week separately?
