mandkhoie
Calcite | Level 5

Hello,

I am a new SAS user, stats is not my area of expertise, and I need help. I am trying to run model diagnostics in GLIMMIX to check for normality of residuals and for extreme outliers at the highest level. I am checking these assumptions at the highest level because some sources say that residuals at the lowest level are not very informative. I have a two-level model. I can get the "Solution for Random Effects" at the highest level using the code below. Since I am using LAPLACE, according to the SAS manual these are the "empirical Bayes estimates (EBE)" of the random effects; I will call them Estimates. I capture all of these Estimates by requesting SOLUTIONR in my ODS OUTPUT statement. Can I treat these Estimates as the residuals of the highest level? If yes, then I can check whether these "residuals" are normal. Also, if that is the case, how do I standardize these Estimates so that I can check for extreme outliers at the highest level? Thanks.

proc glimmix data=final method=LAPLACE;
   class group X1 X2;
   model outcome = X1 X2 X1|X2 /
         link=logit d=binomial s;
   random int / sub=group s;
   covtest 0;
   ods output solutionr=groupLevel;
run;

4 REPLIES
SteveDenham
Jade | Level 19

The "estimates" are NOT residuals, but are best linear unbiased predictions (BLUPs).  You can get residuals by subtracting the fixed effect means, but realize that the difference accounts for all levels of random effects.  Use of the OUTPUT statement can give you the predicted and residual values on the logit scale or the original scale depending on the use of the ILINK option.
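As a rough illustration of the OUTPUT statement described above, the sketch below reuses the model from the question and requests predictions and residuals on both the logit scale and the probability scale. The data set name gmxout and the output variable names are made up for illustration; they do not come from the thread.

```sas
proc glimmix data=final method=laplace;
   class group X1 X2;
   /* X1|X2 expands to X1 X2 X1*X2 */
   model outcome = X1|X2 / link=logit dist=binomial solution;
   random int / subject=group solution;
   /* gmxout and the variable names below are hypothetical */
   output out=gmxout
          pred=eta_hat                 /* linear predictor (logit scale), random effects included */
          resid=r_logit                /* residual on the linear (logit) scale */
          pred(ilink)=p_hat            /* predicted probability, random effects included */
          resid(ilink)=r_prob          /* residual on the probability scale */
          pred(noblup ilink)=p_fixed;  /* predicted probability from the fixed effects only */
run;
```

Comparing p_hat with p_fixed for a few groups shows how much of each prediction comes from the group-level random intercept.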

But WHY are you checking for normality of residuals--you are assuming a binomial distribution.  There is no need whatsoever to check for normality, and if you happened to find it, it would be spurious anyway.  Your values are bounded above and below, and thus residuals will also be bounded, and will NOT be normal, at least on the original scale.  They may be normal on the logit scale, but again, why bother?  You don't have to have normally distributed residuals as an assumption.  They only have to be IID - independent and identically distributed.

I don't know what version of SAS/STAT you are using, but try adding a PLOTS option to your PROC GLIMMIX statement.  That will help in detecting extreme values.

Steve Denham
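A minimal sketch of the PLOTS suggestion above, again reusing the model from the question. PEARSONPANEL is one of the residual panels that PROC GLIMMIX offers; which panels are available, and whether they are produced for a given estimation method, may depend on your SAS/STAT release, so treat this as a starting point and not a recipe.

```sas
ods graphics on;   /* ODS Graphics must be enabled for PLOTS= to produce output */

proc glimmix data=final method=laplace plots=pearsonpanel;
   class group X1 X2;
   /* X1|X2 expands to X1 X2 X1*X2 */
   model outcome = X1|X2 / link=logit dist=binomial solution;
   random int / subject=group solution;
run;

ods graphics off;
```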

lvm
Rhodochrosite | Level 12
Accepted Solution

Actually, the predictions of random effects (i.e., EBLUPs) are sometimes called hierarchical "residuals". Pages 454-455 in Stroup (2013) treat them as a type of residual. In principle, one could assess whether they are normally distributed, etc. However, the formula used to calculate EBLUPs assumes normality; thus, the typical ways of diagnosing (regular) residuals will not work that well. The EBLUPs may look more normal than the underlying random effects really are. Fortunately, linear mixed models and generalized linear mixed models are fairly robust to nonnormality of the random effects. Chuck McCulloch and others have investigated this.

lvm
Rhodochrosite | Level 12

For your information....

You can also get the EBLUPS in the output file for observations by using an OUTPUT statement together with an ID statement.

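proc glimmix data=final method=LAPLACE;
   class group X1 X2;
   model outcome = X1 X2 X1|X2 /
         link=logit d=binomial s;
   random int / sub=group s;
   covtest 0;
   ods output solutionr=groupLevel;
   /* one row per observation: predicted value and residual */
   output out=out pred=p resid=r;
   /* with an ID statement, only the listed variables are copied to OUT=,
      so group is listed alongside _zgamma_ to identify the rows */
   id group _zgamma_;
run;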


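Here, _zgamma_ is the name GLIMMIX uses for the Zgamma term of the mixed model (the random-effect contribution to each observation; for a random-intercept model, the EBLUP of that observation's group). You can then plot these values with SGPLOT or another procedure, and look at their distributional properties with UNIVARIATE.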
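As a hedged sketch of that follow-up step, the code below works on groupLevel, the SOLUTIONR table saved by ODS OUTPUT, which has one row per group. It assumes the table contains the columns Estimate and StdErrPred, as SOLUTIONR output usually does. Dividing each estimate by its standard error of prediction and flagging absolute values above 3 is only one ad hoc way to standardize; neither the method nor the cutoff comes from the thread. The caveat above still applies: EBLUPs are shrunken toward zero and can look more normal than the true random effects.

```sas
/* Distribution of the group-level EBLUPs */
proc univariate data=groupLevel normal;      /* NORMAL requests tests of normality */
   var Estimate;
   histogram Estimate / normal;              /* histogram with fitted normal curve */
   qqplot Estimate / normal(mu=est sigma=est);
run;

/* Ad hoc standardization (an assumption, not an established rule) */
data groupStd;                               /* groupStd, z, extreme are hypothetical names */
   set groupLevel;
   z = Estimate / StdErrPred;                /* estimate scaled by its prediction std. error */
   extreme = (abs(z) > 3);                   /* 1 if beyond the arbitrary cutoff, else 0 */
run;

/* List the groups flagged as extreme */
proc print data=groupStd;
   where extreme = 1;
   var Subject Estimate StdErrPred z;
run;

/* Visual check of the standardized values */
proc sgplot data=groupStd;
   histogram z;
   density z / type=normal;
run;
```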

mandkhoie
Calcite | Level 5

Thank you Steve and lvm, I really appreciate your input. I was looking for what lvm said; thanks for providing the sources and code!
