I am modeling the proportion of white blood cells (TNCC) that are neutrophils (PMN) recovered in bronchoalveolar lavage fluid as a function of diet (forage, 2 levels) at 4 time points using the following code:
proc glimmix data=BAL plots=all;
class number forage week;
model freq_PMN/TNCC = forage week week*forage/solution link=logit;
random _residual_ / subject=number type=ar(1);
run;
This model has worked well for us in numerous experiments and does so here as well. In this instance the sample size is small, with only 7 individuals, because this was run as a pilot study. I am being queried about checking the assumption that random effects are normally distributed. I usually state that residual plots are visually examined to check that the distribution and link function are appropriate (in this case they look good). But I guess I need something that specifically addresses this reviewer's concern about the distribution of the random effect (repeated measure). I have searched this topic in general, and most publications state that this is difficult and, depending on the article, either has important implications or doesn't. I would appreciate any guidance!
Ordinarily, I would include week in the random residual statement, and find this approach interesting.
But that doesn't answer the question. So... Since this is modeling R-side covariances, and there is a logit link, I don't see how you can satisfy the reviewer. In the logit space, the residuals should be approximately normal, so maybe you can extract the residuals and do a QQ plot. Except that these aren't really residuals, but are BLUPs. You could also output variances, although what to do with them is an issue. Testing for normality could also be done, but that requires that the residuals be independent, and these may be clustered by subject. A nonparametric (in the distribution-free sense) test might work (Kolmogorov-Smirnov? Cramér-von Mises?). Stroup's text doesn't go into this issue, so I am winging it here. I trust QQ plots more than I trust normality tests, so that would be my choice. It is just a matter of getting the right things to plot.
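As a rough sketch of the "extract the residuals and do a QQ plot" idea, something like the following might work. The output data set name (glmm_out) and the residual variable names are made up for illustration, and it is worth checking the GLIMMIX documentation for which scale the residuals are computed on before interpreting the plot.

```sas
/* Refit the original model and save residuals to a data set.      */
/* glmm_out, pearson_resid and student_resid are hypothetical names. */
proc glimmix data=BAL;
  class number forage week;
  model freq_PMN/TNCC = forage week week*forage / solution link=logit;
  random _residual_ / subject=number type=ar(1);
  output out=glmm_out pearson=pearson_resid student=student_resid;
run;

/* QQ plot of the studentized residuals against a normal reference. */
/* The NORMAL option also prints the usual normality tests, which   */
/* assume independent observations and so should be read with care. */
proc univariate data=glmm_out normal;
  var student_resid;
  qqplot student_resid / normal(mu=est sigma=est);
run;
```

With 7 subjects at 4 time points there are only 28 points on the plot, so it can show gross departures but not much more.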
SteveDenham
Thanks for your reply, Steve. Regarding the random statement, how would you include week in the random statement and still specify R-side effects? As far as the "mixed" portion of my model goes, all I want to do is account for the repeated-measures aspect of the design, and not spend much time looking at the statistical significance of this random effect. This seems to be a speed bump for some reviewers, and in the future I think I will describe my model as a repeated-measures generalized linear model.
I am still a bit stuck on how to respond in this case as I don't believe that I have a serious problem with the model I have specified.
Thanks for your help!
This is how I usually write an R side covariance:
proc glimmix data=BAL plots=all;
class number forage week;
model freq_PMN/TNCC = forage week week*forage/solution link=logit;
random week/subject=number type=ar(1) residual;
run;
You might consider an AR+RE model as well:
proc glimmix data=BAL plots=all;
class number forage week;
model freq_PMN/TNCC = forage week week*forage/solution link=logit;
random intercept/subject=number;
random week/subject=number type=ar(1) residual;
run;
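If you go the AR+RE route, the random intercept is a G-side effect with an explicit normality assumption, so its predicted values (BLUPs) can be pulled out and plotted directly. A minimal sketch, assuming the same model as above; the data set name blups is made up for illustration:

```sas
/* Same AR+RE model, with SOLUTION on the G-side RANDOM statement  */
/* so that the subject-level intercept predictions are printed.    */
proc glimmix data=BAL;
  class number forage week;
  model freq_PMN/TNCC = forage week week*forage / solution link=logit;
  random intercept / subject=number solution;
  random week / subject=number type=ar(1) residual;
  ods output SolutionR=blups;   /* blups is a hypothetical data set name */
run;

/* QQ plot of the predicted random intercepts, one point per subject. */
proc univariate data=blups;
  var Estimate;
  qqplot Estimate / normal(mu=est sigma=est);
run;
```

With only 7 subjects this gives a 7-point QQ plot, so it is more of a sanity check than a real test of the assumption.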
Good luck with the reviewer.
SteveDenham