Programming the statistical procedures from SAS

high leverage point has lower variance

Contributor
Posts: 37

Dear Experts,

 

I read that a high-leverage point has lower variance. Why is that so? I can see it from the formula Var(e_i) = σ²(1 − h_i), where h_i is the leverage (so the estimated standard error of the residual is s·sqrt(1 − h_i)). But I thought points at the extreme ends tend to influence the regression line a lot. Hence, intuitively, the variance should be higher.

 

Thank you

L


All Replies
Super User
Posts: 10,466

Re: high leverage point has lower variance

Please cite a source for the claim that a high-leverage point has low variance. My basic statistics tells me that no single point has a variance, since variance is a statistic computed from a number of data points.

Contributor
Posts: 37

Re: high leverage point has lower variance

https://en.wikipedia.org/wiki/Studentized_residual

>> ... the residuals, unlike the errors, do not all have the same variance: the variance decreases as the corresponding x-value gets farther from the average x-value ...

 

http://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/20/lecture-20.pdf

>> pg 10. The bigger the leverage of i, the smaller the variance of the residual there.

 

So leverage measures how far x_i is from the average of x, and the higher the leverage, the smaller the variance of the residual. As I mentioned earlier, why is that so? Intuitively, points at the extreme ends will move the regression line a lot, so the variance should be larger than for points near the average of x.
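
For readers following the references, the claim can be written out in a few lines. This is a sketch under the usual OLS assumptions (independent errors with common variance σ², which the thread does not state explicitly); the symbols H, h_i and e follow the standard hat-matrix notation rather than anything posted here.

```latex
% Assumed model: y = X\beta + \varepsilon, with Var(\varepsilon) = \sigma^2 I
\begin{align*}
H &= X (X^\top X)^{-1} X^\top, \qquad \hat{y} = H y, \qquad e = y - \hat{y} = (I - H)\, y \\
\mathrm{Var}(e) &= (I - H)\, \sigma^2 I \, (I - H)^\top = \sigma^2 (I - H)
  && \text{since } I - H \text{ is symmetric and idempotent} \\
\mathrm{Var}(e_i) &= \sigma^2 (1 - h_i), \qquad \mathrm{Var}(\hat{y}_i) = \sigma^2 h_i, \qquad h_i = H_{ii} \\
h_i &= \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}
  && \text{(simple linear regression with intercept)}
\end{align*}
```

Note that Var(ŷ_i) + Var(e_i) = σ². The variance that grows with leverage is that of the fitted value, which matches the intuition that extreme points move the line; the residual gets whatever is left over.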

Super User
Posts: 10,466

Re: high leverage point has lower variance


CheerfulChu wrote:

https://en.wikipedia.org/wiki/Studentized_residual

>> ... the residuals, unlike the errors, do not all have the same variance: the variance decreases as the corresponding x-value gets farther from the average x-value ...

 

http://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/20/lecture-20.pdf

>> pg 10. The bigger the leverage of i, the smaller the variance of the residual there.

 

So leverage measures how far x_i is from the average of x, and the higher the leverage, the smaller the variance of the residual. As I mentioned earlier, why is that so? Intuitively, points at the extreme ends will move the regression line a lot, so the variance should be larger than for points near the average of x.


Here's a brief summary of what they mean by that variance.

 

External studentization uses an estimate of $\mathrm{Var}[\widetilde{e}_i]$ that does not involve the $i$th observation. Externally studentized residuals are often preferred over internally studentized residuals because they have well-known distributional properties in standard linear models for independent data.

 

So the variance estimate used to studentize the residual at x_i is computed from all of the other points, excluding the high-leverage point, which is why it is smaller.
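
To make the two kinds of studentization concrete, here is a minimal NumPy sketch for a simple regression y ~ 1 + x. The function name `studentized_residuals` and the data are made up for illustration; this is not SAS output or SAS code. Both versions divide by sqrt(1 − h_i); they differ only in which estimate of σ² they use.

```python
import numpy as np

def studentized_residuals(x, y):
    """Return (leverage, internal, external) studentized residuals for y ~ 1 + x."""
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix
    h = np.diag(H)                              # leverages h_i
    e = y - H @ y                               # raw residuals
    sse = e @ e
    s2 = sse / (n - p)                          # MSE using all n observations
    # MSE with observation i left out, via SSE_(i) = SSE - e_i^2 / (1 - h_i),
    # so no refitting is needed.
    s2_loo = (sse - e**2 / (1.0 - h)) / (n - p - 1)
    internal = e / np.sqrt(s2 * (1.0 - h))
    external = e / np.sqrt(s2_loo * (1.0 - h))
    return h, internal, external

# Hypothetical data: nine x-values close together and one far away.
x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8, 9, 30])
rng = np.random.default_rng(1)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=x.size)

h, internal, external = studentized_residuals(x, y)
for xi, hi, ri, ti in zip(x, h, internal, external):
    print(f"x={xi:5.1f}  leverage={hi:.3f}  internal={ri:7.3f}  external={ti:7.3f}")
```

The point at x = 30 should show by far the largest leverage in the printed table.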

 

 

Trusted Advisor
Posts: 1,607

Re: high leverage point has lower variance

Variance of what? Of the predicted value? It would be nice if you made that clear.

 

Points at the extremes of the data don't necessarily influence the regression line a lot; they can, but they don't always do so.
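
A small illustration of that distinction, using made-up noise-free data (the helper `slope_with_extra_point` is hypothetical): the same extreme x-value barely matters when its y-value agrees with the trend of the other points, and matters a great deal when it does not.

```python
import numpy as np

# Ten points lying exactly on y = 1 + 2x (no noise, to keep the effect visible).
x = np.arange(1.0, 11.0)
y = 1.0 + 2.0 * x

def slope_with_extra_point(x_new, y_new):
    """Fit y ~ 1 + x after appending one extra observation; return the slope."""
    xs = np.append(x, x_new)
    ys = np.append(y, y_new)
    slope, _intercept = np.polyfit(xs, ys, 1)
    return slope

slope_base = np.polyfit(x, y, 1)[0]
slope_on_line = slope_with_extra_point(30.0, 61.0)   # extreme x, consistent with the line
slope_off_line = slope_with_extra_point(30.0, 20.0)  # extreme x, far from the line

print(f"no extra point:          slope = {slope_base:.3f}")
print(f"extra point on the line: slope = {slope_on_line:.3f}")
print(f"extra point off the line: slope = {slope_off_line:.3f}")
```

In both cases the extra point has the same leverage, since leverage depends only on x. Whether it actually changes the fit depends on its y-value as well.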

Solution
2 weeks ago
SAS Super FREQ
Posts: 3,475

Re: high leverage point has lower variance

Thanks for the references. The confusion was that your post title indicates that the "high-leverage point has lower variance," but the point has zero variance (it is a data point, not a statistic). 

 

The residual for an observation does have variance, which you could estimate by using a bootstrap. I think the answer to your question is that a high-leverage point "pulls up" the OLS regression line towards the y value at that point. Therefore the predicted mean at the high-leverage point is biased to be closer to the observed response at that point. Consequently, that residual will be biased to be small.

 

In short, a high-leverage point (by definition) shrinks the residual value at that point.
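
The shrinkage can be checked by simulation. The sketch below is not from the thread: it assumes a correctly specified simple linear model with normal errors, and the design, coefficients, and σ are arbitrary choices. It repeatedly regenerates y for a fixed x, refits, and compares the empirical variance of each residual with σ²(1 − h_i).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: nine x-values near the centre plus one far away.
x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8, 9, 30])
n = x.size
X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.solve(X.T @ X, X.T)    # hat matrix
h = np.diag(H)                           # leverages

sigma = 2.0                # assumed error standard deviation
beta0, beta1 = 1.0, 0.5    # assumed true coefficients
reps = 20000

# Each row of Y is one simulated response vector for the same x.
Y = beta0 + beta1 * x + rng.normal(scale=sigma, size=(reps, n))
# Residuals for every replicate: e = (I - H) y, and I - H is symmetric.
E = Y @ (np.eye(n) - H)

empirical_var = E.var(axis=0)
theoretical_var = sigma**2 * (1.0 - h)

for xi, hi, ev, tv in zip(x, h, empirical_var, theoretical_var):
    print(f"x={xi:5.1f}  h={hi:.3f}  simulated Var(e)={ev:.3f}  sigma^2(1-h)={tv:.3f}")
```

Under these assumptions each residual still averages to zero across replicates. What the high-leverage point changes is the spread: because the fitted line tracks the observed y there, the residual at x = 30 varies much less than the others.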

Contributor
Posts: 37

Re: high leverage point has lower variance

You are good! Thanks
☑ This topic is SOLVED.
