This is more of a statistics problem than an actual SAS problem...
I've run a regression analysis using a general linear model in SAS. I fitted four models for the same association I want to explore, each with a different number of covariates (confounders). When I checked the normality assumption, I noticed that the distribution of the model's residuals gets closer and closer to normal as the number of covariates in the model increases.
Could anyone explain to me why this happens? Preferably in a "not so mathematical" way? 🙂
Not only that, but the standard deviation of the residuals is getting smaller, too.
As you fit more variables, you are explaining more of the variation in the data. The model fits the data better, which means that the observations lie closer to the regression surface and the residuals shrink toward zero.
If you have one regressor, there might be observations that are far from the model. These "outliers" show up in the residual histogram as being far from zero, so the histogram does not look bell-shaped. As you add more regressors that really are related to the response, there are fewer outliers and the surface passes closer to the points. The histogram of residuals then tends to look more bell-shaped and narrower (smaller standard deviation).
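To see the effect on made-up data, here is a minimal sketch in plain Python (not SAS, only to keep it self-contained). Everything in it is an assumption chosen for illustration: the simulated covariates `x1` and `x2`, the coefficients, and the helper functions `fit_ols` and `skewness`. The response depends on a skewed covariate `x2`; when `x2` is left out of the model, its effect ends up in the residuals and makes them wide and skewed. Once `x2` is included, mostly the normal noise remains.

```python
import random
import statistics

def fit_ols(columns, y):
    """Least-squares fit with an intercept; returns the residuals."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return [y[i] - sum(b * x for b, x in zip(beta, X[i])) for i in range(n)]

def skewness(values):
    """Sample skewness; roughly 0 for a symmetric, bell-shaped sample."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Simulated data (an assumption for illustration, not the poster's data).
random.seed(1)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]        # symmetric covariate
x2 = [random.expovariate(1.0) for _ in range(n)]   # skewed covariate
y = [1 + 2 * a + 3 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

res_small = fit_ols([x1], y)        # model that omits x2
res_full = fit_ols([x1, x2], y)     # model that includes x2

sd_small, sd_full = statistics.pstdev(res_small), statistics.pstdev(res_full)
skew_small, skew_full = skewness(res_small), skewness(res_full)

print(f"y ~ x1      : residual SD = {sd_small:.2f}, skewness = {skew_small:.2f}")
print(f"y ~ x1 + x2 : residual SD = {sd_full:.2f}, skewness = {skew_full:.2f}")
```

Note that this only happens because `x2` truly influences the response. Adding covariates that are unrelated to the response would not be expected to change the shape of the residual distribution much, so more covariates do not guarantee more normal residuals.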