I came across a regression model in SAS that does not use an intercept (constant term), and I would like to see whether the model violates the OLS assumption of zero sum of residuals. How can I check whether that is the case in SAS?
This is not an output of any PROC (that I know of). You could save the residuals to an output data set (for example, with the OUTPUT statement in PROC REG) and sum them yourself (for example, with PROC MEANS).
When you don't use an intercept, the residuals usually will NOT sum to zero.
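To see this numerically, here is a minimal sketch in Python with NumPy rather than SAS. The data are simulated and purely hypothetical (a true intercept of 5.0 and slope of 1.5 are assumptions made for illustration), and `residual_sum` is a helper name invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical data: the true relationship has a nonzero intercept (5.0).
n = 200
x = rng.normal(loc=3.0, scale=2.0, size=n)
y = 5.0 + 1.5 * x + rng.normal(scale=1.0, size=n)


def residual_sum(design, response):
    """Fit OLS by least squares and return the sum of the residuals."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return float(np.sum(response - design @ coef))


# With an intercept: the design matrix includes a column of ones.
sum_with = residual_sum(np.column_stack([np.ones(n), x]), y)

# Without an intercept: the fitted line is forced through the origin.
sum_without = residual_sum(x[:, None], y)

print(f"sum of residuals, with intercept:    {sum_with:.6f}")
print(f"sum of residuals, without intercept: {sum_without:.6f}")
```

With the intercept the sum is zero up to floating-point error; without it, the sum is clearly nonzero for data like these, because the line through the origin cannot absorb the mean of the response.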
I would like to see whether the model violates the OLS assumption of zero sum of residuals
Whether or not the residuals sum to zero in this case, this DOES NOT violate the assumptions of OLS. This is not an assumption of OLS.
A better question to ask when you see a regression with no intercept is: what is the logical justification for not using an intercept? If there is no logical justification, omitting the intercept violates appropriate model building rules, and that is a more serious problem (in my mind) than the residuals not summing to zero (which in my mind is meaningless and not worth bothering over).
@adrfinance wrote:
the variables are standardized to have zero mean and standard deviation 1. So I assume it is ok not to have an intercept. Right or not?
The justification for not having an intercept in this case is fine. Of course, if you back-transform the data to the original scale, there is an intercept.
If you hadn't standardized the data, then not having an intercept requires justification.
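As a rough illustration of both points, the sketch below (Python with NumPy, not SAS) fits a no-intercept regression on standardized variables and then back-transforms the slope. The simulated data and all variable names are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable

# Hypothetical data on the original scale, with a nonzero intercept.
n = 200
x = rng.normal(loc=3.0, scale=2.0, size=n)
y = 5.0 + 1.5 * x + rng.normal(scale=1.0, size=n)

# Standardize both variables to mean 0 and standard deviation 1.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

# No-intercept fit on the standardized scale.
beta_std, *_ = np.linalg.lstsq(zx[:, None], zy, rcond=None)
slope_std = beta_std[0]
resid_sum_std = float(np.sum(zy - slope_std * zx))

# Back-transform to the original scale: an intercept reappears.
slope_orig = slope_std * y.std(ddof=1) / x.std(ddof=1)
intercept_orig = y.mean() - slope_orig * x.mean()

# Reference: ordinary fit with an intercept on the original scale.
ref_coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)
ref_intercept, ref_slope = ref_coef

print(f"residual sum (standardized, no intercept): {resid_sum_std:.6f}")
print(f"back-transformed intercept: {intercept_orig:.6f}")
print(f"reference intercept:        {ref_intercept:.6f}")
```

Because every standardized variable has mean zero, the residuals of the no-intercept fit still sum to zero, and the back-transformed coefficients match the ordinary fit with an intercept on the original scale.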
@adrfinance wrote:
Yes I came across some models where they standardize all the variables and then they run regressions without intercepts. I assume they do these standardizations (0 mean 1 std) in order to be able to run regressions without intercepts in the first place....
My assumption would be that they do the standardizing so that regression coefficients can be compared to one another. I doubt that wanting to run a regression without an intercept is a valid reason for standardizing.
@adrfinance wrote:
Yes, you might be right, but I see many models from them without intercepts, so I just inferred that. Maybe I am wrong though. I always thought that running a model without an intercept was wrong; I didn't know that zero-mean variables would be fine to use in a regression without a constant term.
There are fields of study that have their own practices for whatever historical reason. So that might be what you are seeing. Sometimes these things can become so embedded in the "culture" that the reasons are lost in the mists of history.
I have done OLS regression by hand, meaning pencil and paper. (Admittedly not a lot of records but still...) I can see anything that reduces the steps being considered a "good thing". The process may have survived into the time when computers and software are more flexible just because that was the way it was done years ago.
The "LS" in OLS is least squares. That is, the OLS algorithm minimizes the sum of squared residuals, which need not even minimize the sum of unsquared residuals, much less set that sum to zero.
Oops, as @PaigeMiller and @adrfinance pointed out, I need to review my recollection of OLS. Thanks to both. I've edited this post just in case someone doesn't notice the responses.
Unless you have removed the intercept from the model, the math forces the sum of un-squared residuals to be zero.
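The reason can be sketched from the first-order conditions of least squares. The notation here (coefficients $b_0, \dots, b_p$, residuals $e_i$) is introduced only for this illustration.

```latex
% Least squares minimizes the sum of squared residuals:
S(b_0, b_1, \dots, b_p) = \sum_{i=1}^{n} e_i^2,
\qquad
e_i = y_i - b_0 - \sum_{j=1}^{p} b_j x_{ij}.

% Setting the derivative with respect to the intercept to zero:
\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} e_i = 0
\quad\Longrightarrow\quad
\sum_{i=1}^{n} e_i = 0.

% Without b_0 this condition is absent; only the slope conditions remain:
\frac{\partial S}{\partial b_j} = -2 \sum_{i=1}^{n} x_{ij}\, e_i = 0,
\qquad j = 1, \dots, p.
```

So with an intercept the zero sum is a consequence of the fit, not an assumption. Without one, only the sums $\sum_i x_{ij} e_i$ are forced to zero, and $\sum_i e_i$ can take any value.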
I think you're right.