
## Proc Mixed- Normality test for random effects

Hello,

Is there any way to test for normality of just the random effects in PROC MIXED other than the graphs, such as a Shapiro-Wilk test? That is, is there any way to get a figure for normality?

1 ACCEPTED SOLUTION

## Re: Proc Mixed- Normality test for random effects

My first question is "Why do this?" All of the estimation methods in linear mixed models are based on the assumption that the random effects are normally distributed with a mean of zero, a positive variance, and perhaps some sort of covariance with the other random effects. So why test? And especially, why test with any of the readily available tests for normality, which are overpowered for large sample sizes (they flag trivial departures) and underpowered for small sample sizes (they miss serious ones)? Over 50 years ago, George Box said something like, "To test for normality before analysing the data is akin to going out in a row boat to see if the ocean is safe for an ocean liner." In particular, suppose you found that the random effect in question failed a normality test. Without some graphical aid to identify what the departure was attributable to, you would be left without a method for analysis. If you can identify what is causing the deviation, you could perhaps add an additional factor or grouping to avoid the issue.

So if you truly want to do something in this area, you will need to get the BLUPs for every record. You can do this with an OUTPUT statement in GLIMMIX without too much difficulty: get the default linear predictor (which includes the BLUPs) and subtract the marginal linear predictor, requested with PRED(NOBLUP). It is a bit different in MIXED: specify both the OUTP= and OUTPM= options in the MODEL statement, and calculate the difference between the marginal raw residual (from OUTPM=) and the conditional raw residual (from OUTP=).
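To see why that subtraction works, here is a minimal numerical sketch in plain Python rather than SAS. It assumes a toy random-intercept model; the data, the fixed-effect estimate `beta0`, and the `blup` values are all made up for illustration and are not output from any procedure. The marginal residual is y minus the fixed-effects prediction, the conditional residual is y minus the prediction that also includes the random effect, so their difference is just the predicted random effect for that record.

```python
# Hypothetical toy data: (group, response). Not from any real analysis.
records = [("A", 12.0), ("A", 14.0), ("B", 7.0), ("B", 9.0)]

beta0 = 10.5                    # assumed fixed-effect (intercept) estimate
blup = {"A": 2.0, "B": -2.0}    # assumed predicted random intercepts


def residual_difference(records, beta0, blup):
    """Return (group, marginal residual - conditional residual) per record."""
    out = []
    for group, y in records:
        marginal_resid = y - beta0                      # ignores random effects
        conditional_resid = y - (beta0 + blup[group])   # includes the BLUP
        out.append((group, marginal_resid - conditional_resid))
    return out


diffs = residual_difference(records, beta0, blup)

# The difference is constant within a group, so keep one value per level
# before looking at its distribution; otherwise each level is counted
# once per record.
per_level = dict(diffs)
```

Note that the per-record differences repeat within each level of the random effect, so any plot or test should be run on one value per level, not on every record.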

Now you have an aggregated estimate: the sum of all of the random effects for each record (the residual error itself is what remains in the conditional raw residual). You can partition it based on the relative size of the variance components (sort of like an intraclass correlation, but not quite), but that is based on assuming that the variances are additive, which in turn depends on the assumption of independent and identically scaled distributions.
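As a rough sketch of that partitioning, again in plain Python, the share of each component is its variance estimate divided by the total. The component names and values below are hypothetical, and the calculation is only meaningful under the additivity assumption just described.

```python
def variance_shares(components):
    """Share of the total variance for each component, assuming the
    variances are additive (independent effects)."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Hypothetical variance component estimates for two random effects.
shares = variance_shares({"block": 4.0, "plot_within_block": 12.0})
```

With these made-up numbers, the block effect would be assigned a quarter of the aggregated quantity and the plot-within-block effect the remaining three quarters.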

So it is difficult, fraught with pitfalls, and liable to be misleading to "test" for normality of variance components. And I haven't even touched on the issue of "what is the expected distribution of a variance estimator?" (Answer: by Cochran's theorem, it is a scaled chi-squared distribution, not a normal distribution.)
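For reference, the simplest case of that result is an i.i.d. normal sample $X_1, \dots, X_n$ with variance $\sigma^2$ and sample variance $S^2$. This is only the textbook special case; variance component estimates from a mixed model (REML in particular) follow it exactly only in special situations such as mean squares in balanced designs.

$$
\frac{(n-1)\,S^2}{\sigma^2} \sim \chi^2_{n-1}
\quad\Longrightarrow\quad
S^2 \sim \frac{\sigma^2}{n-1}\,\chi^2_{n-1}
$$

So even when the data are perfectly normal, the variance estimator is right-skewed, and more so when the degrees of freedom are small.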

SteveDenham

Thanks,

Koen
