Lauren_Hanna
Calcite | Level 5

Hi - I have a dataset on cattle where we captured the amount of time each animal spent eating (in minutes). We have several class effects to include (breed, size, year of study, and day relative to estrus). Running this model in PROC MIXED "as is" suggests the residuals are not normally distributed. Comparing square root and log transformations in PROC MIXED suggests that the square root transformation is the better option. When I run the same model in PROC GLIMMIX with a square root link, so that I can get inverse-linked means and standard errors, the results are not the same (with or without the NLOPTIONS statement in the code provided). The PROC GLIMMIX output looks almost like the PROC MIXED output for the untransformed variable. I'm not sure why, and I am hoping experts here can help explain. I've attached SAS output of these scenarios to demonstrate. The code used is:

*TIME, Min - no transformation;
proc mixed data= nobullestrusC plots = all;
class HeiferID Breed FSGrp Year DRE;
model TimeMin = Year Date Breed|DRE FSGrp|DRE / ddfm=kr;
repeated DRE / subject=HeiferID type=csh;
run;

*TIME, Min - log function transformed;
proc mixed data= nobullestrusC plots = all;
class HeiferID Breed FSGrp Year DRE;
model logTmin = Year Date Breed|DRE FSGrp|DRE / ddfm=kr;
repeated DRE / subject=HeiferID type=csh;
run;

*TIME, Min - square root function transformed;
proc mixed data= nobullestrusC plots = all;
class HeiferID Breed FSGrp Year DRE;
model sqrtTmin = Year Date Breed|DRE FSGrp|DRE / ddfm=kr;
repeated DRE / subject=HeiferID type=csh;
run;

*Time, Min - square root transformation in glimmix;
proc glimmix data = nobullestrusC plots=all;
nloptions technique = NRRIDG;
class HeiferID Breed FSGrp Year DRE;
model TimeMin = Year Date Breed|DRE FSGrp|DRE / ddfm=kr link = power(0.5);
random DRE / subject = HeiferID type = CSH residual; 
run;

I'll also note that my choice of covariance structure was limited because not every type ran in both procedures; CSH was the best fit among those that worked in both. Thank you for your help here!

jiltao
SAS Super FREQ

The PROC MIXED approach uses the square root of y as the response variable; the PROC GLIMMIX approach models the square root of the mean (mu), not of y. The two models are different.

Thanks,

Jill
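
The distinction Jill describes can be written out as follows. This is only an illustrative sketch: the subscripts (animal i, day j), the design vector x, and the error term e are notation introduced here, not taken from the thread, and the covariance structure is left out.

```latex
% PROC MIXED on the transformed response:
% normal errors on the square-root scale
\[
\sqrt{y_{ij}} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + e_{ij}
\quad\Longrightarrow\quad
\mathrm{E}\!\left[\sqrt{y_{ij}}\right] = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
\]

% PROC GLIMMIX with LINK=POWER(0.5) and the default normal distribution:
% normal errors on the original (minutes) scale
\[
y_{ij} = \mu_{ij} + e_{ij},
\qquad
\sqrt{\mu_{ij}} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
\]

% Jensen's inequality (the square root is concave),
% so the two linear predictors describe different quantities
\[
\mathrm{E}\!\left[\sqrt{y_{ij}}\right] \le \sqrt{\mathrm{E}\!\left[y_{ij}\right]} = \sqrt{\mu_{ij}}
\]
```

Because the second model keeps its errors on the minutes scale, its residual diagnostics would be expected to resemble those of the untransformed PROC MIXED fit, which matches what the original post reports.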

Lauren_Hanna
Calcite | Level 5

Thank you Jill. How do I need to change the PROC GLIMMIX code so that it mirrors the PROC MIXED version? Is that possible? I thought it was from some other SAS Community posts I read, but perhaps I am mistaken.

jiltao
SAS Super FREQ

To get the same result in PROC GLIMMIX, you would need to use the same transformed dependent variable as you did in PROC MIXED. For example:

proc glimmix data= nobullestrusC plots = all;
class HeiferID Breed FSGrp Year DRE;
model sqrtTmin = Year Date Breed|DRE FSGrp|DRE / ddfm=kr;
random DRE / subject=HeiferID type=csh residual;
run;

Lauren_Hanna
Calcite | Level 5

Thank you again, Jill. This solution defeats the purpose of what I am trying to accomplish. Because the TIME variable has non-normal tendencies, I want to model it in PROC GLIMMIX so that the modeling assumptions are met while still obtaining inverse-linked predicted means and standard errors for the fixed effects (i.e., predicted means and SEs on the TIME scale, not the square root scale). Using a link of the same form as the transformation in PROC GLIMMIX does not improve the fit (see original PDF). So far I have not found a better fit than sqrtTmin in PROC MIXED, but I cannot get back-transformed estimates by that route. Trying a different distribution with link = power(0.5) also poses problems. Do you have any recommendations in this case?

jiltao
SAS Super FREQ

Then I am not sure why you wanted to "mirror" the PROC MIXED model. PROC MIXED assumes the response variable is normally distributed.

You might fit the model in PROC GLIMMIX and use the DIST= option to specify a distribution that works better than the normal for your data.

SteveDenham
Jade | Level 19

Elapsed times often follow a gamma distribution.

SteveDenham

Lauren_Hanna
Calcite | Level 5

Steve - Thank you for the thoughts there. I changed the distribution to gamma with the default link setting. The distributions of the residuals do not look different, but they are tighter around zero, most within the +/- 1 range. There were a couple above +1 (1.56 was the highest), but I could not see anything in their raw form that would justify removing them. I think we can proceed with this fit and be comfortable with the outcomes.

I appreciate both of your comments here!
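
For readers who land here later, the gamma fit described above might look roughly like the sketch below. The final code was not posted in the thread, so this is a guess assembled from the earlier statements: the LSMEANS effects are chosen only for illustration, and it assumes TimeMin is strictly positive (gamma does not accept zeros). LINK=LOG is written out explicitly; it is the PROC GLIMMIX default for DIST=GAMMA.

```sas
* Sketch only - not the code actually run in this thread;
* Assumes TimeMin is strictly positive, as the gamma distribution requires;
proc glimmix data=nobullestrusC plots=all;
   class HeiferID Breed FSGrp Year DRE;
   model TimeMin = Year Date Breed|DRE FSGrp|DRE / dist=gamma link=log ddfm=kr;
   random DRE / subject=HeiferID type=csh residual;
   * ILINK adds means and standard errors on the TimeMin (minutes) scale;
   lsmeans Breed*DRE FSGrp*DRE / ilink;
run;
```

The ILINK option is what addresses the original goal: the least squares means are estimated on the log scale and then reported on the data scale alongside delta-method standard errors, so no manual back-transformation is needed.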
