Statistical Procedures

laurenhosking · Posted 11-19-2020 10:02 AM

I’m performing a multivariate regression and my residuals are not normally distributed. I thought I needed a log transformation of y, but after applying it my residuals still aren’t normally distributed. Any tips on where I’ve gone wrong?

Rick_SAS · Posted 11-19-2020 11:01 AM

That's a huge topic, but three common reasons are:

A misspecified model. The residuals show a systematic trend, such as a quadratic effect that might need to be included in the model.
Heteroscedasticity. The residual plot is often "fan-shaped": the residuals tend to be small on one side of the plot and large on the other.
Correlated errors. The residuals show up as runs of consecutive high or low values, rather than a "random scatter" of points.
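
The three patterns above are usually judged by eye from a residual plot, but a rough numeric summary can make them concrete. The following is a minimal sketch in plain Python (the thread does not show any code, so the language, the toy data, and every name here, such as `diagnose` and `simulate`, are assumptions made for illustration). It fits a straight line to simulated data with one planted problem at a time and reports one crude indicator per pattern.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def diagnose(x, y):
    """Fit a straight line and summarize three residual patterns."""
    b0, b1 = fit_line(x, y)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mean_x = sum(x) / len(x)
    return {
        # Residuals that track (x - mean)^2 suggest a missing quadratic term.
        "curvature": corr(resid, [(xi - mean_x) ** 2 for xi in x]),
        # |residual| growing with x is the "fan shape".
        "spread_trend": corr([abs(r) for r in resid], x),
        # Neighbouring residuals that move together suggest correlated errors.
        "lag1_autocorr": corr(resid[:-1], resid[1:]),
    }

def simulate(kind, n=400, seed=1):
    """Hypothetical toy data: y = 1 + 2x plus one deliberately planted problem."""
    rng = random.Random(seed)
    x = [10 * i / (n - 1) for i in range(n)]
    y, prev = [], 0.0
    for xi in x:
        noise = rng.gauss(0, 1)
        if kind == "quadratic":
            err = 0.5 * xi ** 2 + noise        # term the straight line omits
        elif kind == "fan":
            err = (0.2 + 0.5 * xi) * noise     # error spread grows with x
        elif kind == "autocorrelated":
            err = prev = 0.8 * prev + noise    # AR(1) errors
        else:                                  # "clean"
            err = noise
        y.append(1 + 2 * xi + err)
    return x, y

for kind in ("clean", "quadratic", "fan", "autocorrelated"):
    stats = diagnose(*simulate(kind))
    print(kind, {name: round(value, 2) for name, value in stats.items()})
```

These indicators are not mutually exclusive (an omitted quadratic term also makes neighbouring residuals look alike when the data are ordered by x), so they complement the residual plots rather than replace them.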

Just FYI, you only need normality if you intend to use inferential statistics. The predicted values are valid regardless.
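
As a small illustration of that last point, here is a sketch in plain Python (the data, seed, and function name are assumptions for the example, not anything from the thread). The errors are strongly right-skewed, yet least squares still recovers the coefficients used to generate the data, provided the errors have mean zero and the model form is correct.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(7)
n = 5_000
x = [10 * i / (n - 1) for i in range(n)]

# Exponential errors shifted to mean zero: clearly non-normal (right-skewed).
errors = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

# Hypothetical true model: intercept 1, slope 2.
y = [1 + 2 * xi + e for xi, e in zip(x, errors)]

intercept, slope = fit_line(x, y)
print(f"intercept={intercept:.3f}  slope={slope:.3f}")
```

What the non-normality does affect is the exactness of t tests, F tests, and confidence intervals, especially in small samples.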

PaigeMiller · Posted 11-19-2020 11:05 AM

A fourth explanation for non-normal residuals is that the assumption of the errors being normally distributed is just plain wrong in this data.

--
Paige Miller

ballardw · Posted 11-19-2020 11:40 AM

It never hurts to show the regression procedure code that you used.

That may give folks like @PaigeMiller or @Rick_SAS some additional clues to look at. It also helps to include some of the model diagnostics and summaries, such as the number of observations.

SteveDenham · Posted 11-20-2020 07:51 AM

@ballardw makes a great point. If you are doing some sort of testing for normality, be aware that for large datasets even a minor deviation from normality will be found to be significant, and for smaller datasets, single points may lead to significance. Remember that linear models are remarkably robust to violations of the assumption that the residuals are normal. Consequently, if you must do testing, set your alpha at a smaller than usual level, say 0.001. Better to follow @Rick_SAS's lead and examine plots of the residuals.
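
The sample-size effect can be sketched with a few lines of plain Python. The thread does not say which normality test is meant, so the Jarque-Bera statistic is used here purely as an assumption because it is easy to compute by hand; the simulated "residuals" and the seed are likewise made up for the example. The same mildly skewed distribution is tested at increasing sample sizes.

```python
import math
import random

def jarque_bera(values):
    """Return (statistic, asymptotic p-value) of the Jarque-Bera test."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    stat = n / 6 * (skew ** 2 + excess_kurtosis ** 2 / 4)
    # A chi-square variable with 2 degrees of freedom has
    # survival function exp(-x / 2), so the p-value is closed-form.
    return stat, math.exp(-stat / 2)

rng = random.Random(2020)

# Mildly skewed "residuals": a standard normal plus a small quadratic bend.
sample = []
for _ in range(50_000):
    z = rng.gauss(0, 1)
    sample.append(z + 0.05 * (z * z - 1))

for n in (50, 500, 5_000, 50_000):
    stat, p = jarque_bera(sample[:n])
    print(f"n={n:>6}  JB={stat:9.2f}  p={p:.3g}  reject at 0.001: {p < 0.001}")
```

The shape of the distribution is identical at every n; only the amount of data changes. Because the statistic is multiplied by n, a fixed small departure from normality eventually becomes "significant" at any alpha. The chi-square p-value is only an asymptotic approximation and is rough for small samples.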

SteveDenham
