adrfinance
Obsidian | Level 7

I came across a regression model in SAS which does not use an intercept (constant term), and I would like to see whether the model violates the OLS assumption of zero sum of residuals. How can I view if that is the case in SAS?

Accepted Solution
PaigeMiller
Diamond | Level 26

This is not an output of any PROC (that I know of). You could save the residuals to an output data set and sum them yourself.
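A minimal sketch of that approach, assuming PROC REG is the procedure in use. The data set `sashelp.class`, the variables `weight`, `height` and `age`, and the names `work.resid_out` and `resid` are placeholders for illustration, not taken from the original model:

```sas
/* Fit a no-intercept model and save the residuals.          */
/* sashelp.class and the variable names are placeholders.    */
proc reg data=sashelp.class noprint;
   model weight = height age / noint;
   output out=work.resid_out r=resid;   /* r= names the residual variable */
run;
quit;

/* Sum (and average) the saved residuals. */
proc means data=work.resid_out n sum mean;
   var resid;
run;
```

Dropping the `noint` option and rerunning gives the with-intercept fit for comparison; there the sum should be zero apart from floating-point rounding.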

 

When you don't use an intercept, the residuals usually will NOT sum to zero.

 

I would like to see whether the model violates the OLS assumption of zero sum of residuals

Whether or not the residuals sum to zero in this case, this DOES NOT violate the assumptions of OLS. This is not an assumption of OLS.

 

A better question to ask when you see a regression with no intercept is: what is the logical justification for not using an intercept? That would be a violation of appropriate model building rules if there is no logical justification for this, and a more serious problem (in my mind) than the residuals not summing to zero (which in my mind is meaningless and not worth bothering over).

--
Paige Miller


adrfinance
Obsidian | Level 7
the variables are standardized to have zero mean and standard deviation 1. So I assume it is ok not to have an intercept. Right or not?
PaigeMiller
Diamond | Level 26

@adrfinance wrote:
the variables are standardized to have zero mean and standard deviation 1. So I assume it is ok not to have an intercept. Right or not?

The justification for not having an intercept in this case is fine. Of course, if you back-transform the data to the original scale, there is an intercept.

 

If you hadn't standardized the data, then not having an intercept requires justification.
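One way to check this for yourself, as a rough sketch: standardize the response and the predictors, then fit the model both with and without an intercept. The data set `sashelp.class`, its variables, the output name `work.class_std`, and the model labels `with_int` and `no_int` are placeholders chosen for illustration:

```sas
/* Standardize the response and predictors to mean 0, std 1. */
proc standard data=sashelp.class mean=0 std=1 out=work.class_std;
   var weight height age;
run;

/* Fit the same model with and without an intercept. */
proc reg data=work.class_std;
   with_int: model weight = height age;
   no_int:   model weight = height age / noint;
run;
quit;
```

Because every variable has mean zero, the intercept estimate in the first model should be zero up to rounding, and the slope estimates should agree between the two models. For the same reason, the residuals of the no-intercept fit sum to zero here even though no intercept was estimated.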

--
Paige Miller
adrfinance
Obsidian | Level 7
Yes I came across some models where they standardize all the variables and then they run regressions without intercepts. I assume they do these standardizations (0 mean 1 std) in order to be able to run regressions without intercepts in the first place....
PaigeMiller
Diamond | Level 26

@adrfinance wrote:
Yes I came across some models where they standardize all the variables and then they run regressions without intercepts. I assume they do these standardizations (0 mean 1 std) in order to be able to run regressions without intercepts in the first place....

My assumption would be that they do the standardizing so that regression coefficients can be compared to one another. I doubt that wanting to run a regression without an intercept is a valid reason for standardizing.
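If comparable coefficients are the goal, PROC REG can report standardized estimates without standardizing the data by hand, through the `stb` option on the MODEL statement. A short sketch, again using `sashelp.class` and its variables purely as placeholders:

```sas
/* Request standardized regression coefficients alongside   */
/* the usual estimates; the intercept stays in the model.   */
proc reg data=sashelp.class;
   model weight = height age / stb;
run;
quit;
```

The standardized estimates appear as an extra column in the parameter estimates table.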

--
Paige Miller
adrfinance
Obsidian | Level 7
Yes you might be right but I see many models from them without intercepts so I just made that out. Maybe I am wrong though. I always thought that running a model without an intercept is wrong, didn't know about the fact that a zero mean variable would be fine to be used in a regression without a constant term.
ballardw
Super User

@adrfinance wrote:
Yes you might be right but I see many models from them without intercepts so I just made that out. Maybe I am wrong though. I always thought that running a model without an intercept is wrong, didn't know about the fact that a zero mean variable would be fine to be used in a regression without a constant term.

There are fields of study that have their own practices for whatever historical reason. So that might be what you are seeing. Sometimes these things can become so embedded in the "culture" that the reasons are lost in the mists of history.

 

I have done OLS regression by hand, meaning pencil and paper. (Admittedly not a lot of records but still...) I can see anything that reduces the steps being considered a "good thing". The process may have survived into the time when computers and software are more flexible just because that was the way it was done years ago.

mkeintz
PROC Star

The "LS" in OLS is least squares. That is, the OLS algorithm minimizes the sum of squared residuals, which need not even minimize the sum of unsquared residuals, much less set that sum to zero.

Oops, as @PaigeMiller and @adrfinance pointed out, I need to review my recollection of OLS. Thanks to both. I've edited this post just in case someone doesn't notice the responses.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

Unless you have removed the intercept from the model, the math forces the sum of un-squared residuals to be zero.
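A sketch of why, using notation assumed here rather than taken from the thread: $b_0$ is the intercept, $b_1, \dots, b_p$ are the slopes, and $e_i$ is the residual for observation $i$ at the least-squares solution. Setting the derivatives of the sum of squares to zero gives:

```latex
\begin{align*}
S(b_0, b_1, \dots, b_p) &= \sum_{i=1}^{n} \Bigl(y_i - b_0 - \sum_{j=1}^{p} b_j x_{ij}\Bigr)^2, \\
\frac{\partial S}{\partial b_0} &= -2 \sum_{i=1}^{n} e_i = 0
  \quad\Longrightarrow\quad \sum_{i=1}^{n} e_i = 0, \\
\frac{\partial S}{\partial b_j} &= -2 \sum_{i=1}^{n} x_{ij}\, e_i = 0
  \quad (j = 1, \dots, p).
\end{align*}
```

Without $b_0$, only the last set of equations remains. The residuals are then orthogonal to each predictor, but nothing forces them to sum to zero unless a constant column happens to lie in the span of the predictors.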

--
Paige Miller
mkeintz
PROC Star

I think you're right.

