Quartz | Level 8
I’m performing a multivariate regression and my residuals are not normally distributed. I thought a log transformation of y would fix this, but after applying it the residuals still aren’t normally distributed. Any tips on where I’ve gone wrong?

That's a huge topic, but three common reasons are

Just FYI, you only need normality if you intend to use inferential statistics. The predicted values are valid regardless.
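To illustrate the point about predicted values, here is a minimal simulation sketch in Python (standard library only), not SAS code from this thread. The model y = 1 + 2x + e, the exponential error term, the seed, and the helper names `fit_line` and `skewness` are all assumptions made up for the example. The errors are strongly right-skewed, yet the least-squares fit should still land close to the true line.

```python
import random
from statistics import fmean

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    xbar, ybar = fmean(x), fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
x = [rng.uniform(0, 10) for _ in range(n)]

# Hypothetical data: true line 1 + 2x, errors Exp(1) - 1 (mean 0, right-skewed).
y = [1 + 2 * xi + (rng.expovariate(1.0) - 1) for xi in x]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
resid_skew = skewness(residuals)

print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("residual skewness:", round(resid_skew, 3))
```

The residuals are clearly non-normal, but the fitted coefficients, and therefore the predictions, are not distorted by that. What the skewness can affect is the accuracy of p-values and confidence intervals, mostly in small samples.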

Diamond | Level 26

A fourth explanation for non-normal residuals is that the assumption of the errors being normally distributed is just plain wrong in this data.

Paige Miller
Super User

It never hurts to show the regression procedure code that you used.


That may give the folks like @PaigeMiller or @Rick_SAS some additional clues to look at. And maybe include some of the model diagnostics/summaries like numbers of observations and such.

Jade | Level 19

@ballardw makes a great point. If you are doing some sort of testing for normality, be aware that for large datasets even a minor deviation from normality will be found to be significant, and for smaller datasets, single points may lead to significance. Remember that linear models are remarkably robust to the assumption of the normality of residuals. Consequently, if you must do testing, set your alpha at a smaller than usual level, say 0.001. Better to follow @Rick_SAS's lead and examine plots of the residuals.
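The sample-size effect can be sketched with a few lines of Python (standard library only). This is an illustration under stated assumptions, not the test anyone in this thread ran: it uses the Jarque-Bera statistic simply because it is easy to compute by hand, and the helper `slightly_skewed` builds an idealized, deterministic sample (normal quantiles nudged by a hypothetical factor `a=0.02`) so that the shape of the data is the same at every n.

```python
from math import exp
from statistics import NormalDist, fmean

def jarque_bera(values):
    """Return (JB statistic, p-value).

    Under normality JB is approximately chi-square with 2 df, and the
    upper tail probability of that distribution is exp(-JB / 2).
    """
    n = len(values)
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    m4 = fmean([(v - m) ** 4 for v in values])
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    return jb, exp(-jb / 2)

def slightly_skewed(n, a=0.02):
    """Idealized sample: normal quantiles z mapped to z + a*(z^2 - 1)."""
    std = NormalDist()
    zs = [std.inv_cdf((i + 0.5) / n) for i in range(n)]
    return [z + a * (z * z - 1) for z in zs]

# Same mild departure from normality, two very different sample sizes.
results = {}
for n in (100, 100_000):
    results[n] = jarque_bera(slightly_skewed(n))
    print(n, "JB =", round(results[n][0], 2), "p =", results[n][1])
```

The deviation is identical in both runs, but the statistic grows roughly in proportion to n, so the large sample rejects normality decisively while the small one does not. That is why a plot of the residuals tends to be more informative than the p-value alone.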



