Hi community, I have to analyze lamb body weight data from 3 years, with 4 treatments (a control and three levels of protein supplementation) and two repetitions. In each experiment (year), body weight was taken approximately every week. I think I can analyze these data using GLIMMIX, assuming a normal distribution.
I don't know how to prepare the data to take year into account as a random effect. I mean: if I use the days since the beginning of each experiment (in univariate format) I have day 0, day 7, day 14, day 21 and so on, but the initial weights are different, so day 21 of one year could be very different from day 21 of another year.
So:
1) Maybe it is better to put the weights in multivariate format and analyze final weight with initial weight as a covariate? Or how can I arrange the data in univariate format to take this into account?
2) Would it be the same if I calculated weight gain for each week and forgot about the initial weights?
3) Can somebody give me the statements in an example with similar data?
Thanks in advance
I'll note a couple of considerations about using initial weight as a covariate to augment Steve's fine solution.
First, if you have several weeks' worth of data, then you could consider regressing the response (weight or weight gain) on week. Because each lamb has its own regression, you would use a random coefficients model, which allows variance among intercepts (equivalent to blocking on lamb), variance among slopes (or other regression parameters), and potentially covariance between intercepts and slopes. I would not necessarily expect the regression to be linear on the original scales, so you would want to plot the response against week for each lamb to get a visual assessment of the form of the relationship and of how much variability you have in the growth profiles among lambs and among diets. I usually don't bother modeling the covariance structure of the residuals for a random coefficients model because it is the variance among lambs that matters for tests of diet, not the variance within. Other people would disagree with me on this point. If you are using week as a classification variable, then you definitely will need to address the covariance structure among the repeated measurements, as Steve illustrates in his code. (Note that the option in that RANDOM statement must be written "residual type=", with a space, not "residualtype=".)
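A minimal sketch of such a random coefficients model, assuming the long-form data set and variable names from Steve's code (yourdata, weight, diet, year, animal, studyday), that animal IDs are unique within a year, and that growth is close to linear over the study period. The time variable is left out of the CLASS statement so that it is treated as continuous; none of this is tested against the actual data.

```sas
proc glimmix data=yourdata;
  class diet year animal;
  /* studyday is NOT in CLASS, so it enters as a continuous regressor; */
  /* diet*studyday lets each diet have its own slope                   */
  model weight = diet studyday diet*studyday / solution ddfm=kr;
  /* year-to-year variability */
  random intercept / subject=year;
  /* lamb-specific intercepts and slopes; TYPE=UN also estimates */
  /* the covariance between intercept and slope                  */
  random intercept studyday / subject=animal(year) type=un;
run;
```

If the plots suggest curvature, a quadratic term (studyday*studyday) or a transformation would be the natural next thing to try before adding further random effects.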
Second, from the plots I suggest above, you can also get a visual sense of how the response might be affected by initial weight. An even better visual would be plots of weight in week i (for i = 1 to the last week) against initial weight for all lambs. You might see a strong relationship between initial weight and weight in the early weeks that subsequently becomes weaker as time passes. Or no relationship at all. Or something else entirely. Again, you need to assess the form of the relationship: Is it linear (as specified in Steve's model), or not? If you need to transform weight (or use a generalized linear mixed model) to meet assumptions, is the relationship linear on that scale? Does the relationship appear to vary among diets (suggesting an interaction of initial weight and diet), or among weeks (interaction of initial weight and week), or among years? With only two replicates, you probably will need to keep your model as parsimonious as possible to avoid overfitting.
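The two kinds of plots described here could be drawn along the following lines. This is only a sketch: it assumes the long-form data set and variable names from Steve's code, with cov holding the initial (Day 0) weight, and it builds a hypothetical identifier, lamb, so that animals with the same ID in different years are drawn as separate lines.

```sas
/* Unique lamb identifier (hypothetical helper variable) */
data plotdata;
  set yourdata;
  length lamb $40;
  lamb = catx('_', year, animal);
run;

/* SERIES connects points in data order, so sort by time within lamb */
proc sort data=plotdata;
  by lamb studyday;
run;

/* Growth profile of every lamb, one panel per diet and year */
proc sgpanel data=plotdata;
  panelby diet year;
  series x=studyday y=weight / group=lamb;
run;

/* Weight against initial weight, one panel per study day */
proc sgpanel data=plotdata;
  panelby studyday;
  scatter x=cov y=weight / group=diet;
  reg x=cov y=weight / nomarkers;  /* overall linear fit within each panel */
run;
```

In the second set of panels, a fitted line that flattens as the study days advance would match the "strong early, weaker later" pattern mentioned above.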
I'm a believer in a statement that Frank Harrell made in the preface of his book on Regression Modeling Strategies. I don't have it with me so I'll paraphrase: It's almost as dangerous to consider the data when modeling as not doing so. Patterns that you see in the data can inform modeling decisions, but of course, you do need to be objectively careful.
It may help to provide some examples of your data, so we can see what you have, and a fuller description of each variable.
Also state the purpose of the analysis, as it is entirely too easy to include things that aren't needed, or to use them in a suboptimal manner.
Things to think about.
Define the Day 0 weight as the weight when the animals are assigned to treatment. It should be taken before they have their first offered feed on the respective diets.
Define all days relative to Day 0 (I think you have done this).
Put the data in long form, with the Day 0 weight for a given year as a covariate.
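If the weights currently sit in multivariate (wide) format, the long form described above could be built roughly as follows. The input data set lambs_wide and the weight columns w0, w7, w14 and w21 are hypothetical names, and the sketch assumes that the covariate is each lamb's own Day 0 weight and that Day 0 is then dropped from the responses; adjust both if your design calls for something else.

```sas
/* Hypothetical wide layout: one row per lamb per year, */
/* with weights in w0 (Day 0), w7, w14, w21             */
data yourdata;
  set lambs_wide;
  array w{4} w0 w7 w14 w21;
  array d{4} _temporary_ (0 7 14 21);
  cov = w0;                    /* Day 0 weight carried along as the covariate */
  do i = 2 to dim(w);          /* start at 2: Day 0 is not used as a response */
    studyday = d{i};
    weight   = w{i};
    if not missing(weight) then output;
  end;
  keep year diet animal studyday weight cov;
run;
```

Each lamb then contributes one row per post-baseline weighing, with year, diet, animal, studyday, weight and cov as the only variables.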
Then the following should work:
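proc glimmix data=yourdata;
  class diet year studyday animal;
  /* cov = Day 0 weight */
  model weight = diet studyday diet*studyday cov;
  /* G-side: year, and animal within year */
  random intercept animal / subject=year;
  /* R-side: AR(1) among the repeated measures on each animal */
  random studyday / residual type=ar(1) subject=animal*year;
  /* each diet vs. control, overall and within each study day */
  lsmeans diet / diff=control;
  lsmeans diet*studyday / slicediff=studyday slicedifftype=control;
run;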
Steve Denham
Great Steve, thanks a lot for your support!
The only thing I see is that this procedure "mixes" the 3 years studied, doesn't it? It is as if you build one growth curve for each diet, and I don't know if this is correct because the initial weight in each year is different. Even when you use the initial weight as a covariate, for each study day you have a kind of "average" of different weights, don't you?
Can I add the year*diet*studyday interaction to solve this, or is that not correct?
Thanks again
By using the Day 0 weight for each year as the covariate, you are removing the "different values" over year, so that you get a single trajectory per diet. With year as a random effect, it only adds variability to the values, not a level amount. If you wish to think of it as a level amount, then you should treat year as a fixed effect (and if you were asking this question in the R communities, I think most of the active posters would recommend that). However, I think year only adds variability, and the broader inference space of possible years makes more sense to me.
So, no 3 way interaction.
You can get BLUPs for the years by adding the solution option to the random statement:
random intercept animal/subject=year solution;
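For comparison, the fixed-effect treatment of year mentioned above might look like the sketch below. It is only an illustration under the same assumed data set and variable names, not a recommendation: year moves into the MODEL statement, so the G-side RANDOM statement keeps only animal within year.

```sas
proc glimmix data=yourdata;
  class diet year studyday animal;
  /* year as a fixed effect: each year gets its own level shift */
  model weight = year diet studyday diet*studyday cov;
  /* animal within year */
  random animal / subject=year;
  /* AR(1) among the repeated measures on each animal */
  random studyday / residual type=ar(1) subject=animal*year;
  lsmeans diet / diff=control;
run;
```

With only 3 years, inference from this version is limited to those particular years, which is the narrower inference space referred to above.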
Steve Denham
Thanks! Great answer, huge help!