Weighted linear regression
08-14-2016 12:22 PM

I have data for which I ran a regression. The White test for constant variance gave p = 0.0016, indicating heteroscedasticity. See the attached graph of residuals vs. predicted values. The data were normally distributed (see the attached distribution graph), with p = 0.79 for the Shapiro-Wilk test. These results indicate that I need to do a weighted regression.

proc reg; /* weighted linear regression */
   model y = x;
   weight w; /* w holds the observation weights */
run;

In the literature I read: "If however we know the noise variance $\sigma_i^2$ at each measurement $i$, and set $w_i = 1/\sigma_i^2$, we get the heteroskedastic MLE, and recover efficiency."
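As a minimal sketch of the quoted rule, assuming the per-observation noise variance really is known and stored in a variable, the weights could be built in a DATA step before calling PROC REG. The data set name `mydata` and the variable `sigma2` are hypothetical and only illustrate the idea.

```sas
/* Assumption: mydata contains y, x, and a known noise variance sigma2 */
data weighted;
   set mydata;
   /* w = 1 / variance; left missing when sigma2 is not positive */
   if sigma2 > 0 then w = 1 / sigma2;
run;

proc reg data=weighted;
   model y = x;
   weight w;   /* each observation is weighted by 1 / sigma2 */
run;
quit;
```

In practice the variances are rarely known, which is what the question below is about.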

My question is: how do we know this weight value, and, based on my data, what would be an appropriate weight, e.g., 1/y or something else?

Accepted Solutions

Solution

08-19-2016
02:52 PM


08-15-2016 10:19 AM

The optimal weight values are unknown.

I can think of three options, listed from easiest to most difficult:

1) Transform the response variable by applying a variance-stabilizing transformation. A typical transformation is to define LogY = log(y) and then model LogY as a function of X. This would require that Y > 0 for your response variable, but there are ways to handle negative values, too.

2) Use robust regression, especially M-estimation by using PROC ROBUSTREG, if you think that your response variable has been contaminated by outliers.

3) Implement iteratively reweighted least squares regression by using PROC NLIN.
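Options 2 and 3 might look roughly like the sketch below. The data set name `mydata`, the starting values in PARMS, and the variance model (variance proportional to the squared mean) are all assumptions chosen for illustration, not something established in this thread.

```sas
/* Option 2: M-estimation with PROC ROBUSTREG */
proc robustreg data=mydata method=m;
   model y = x;
run;

/* Option 3: iteratively reweighted least squares with PROC NLIN.  */
/* The weights are recomputed from the current fit at each         */
/* iteration; NOHALVE is used because the objective function       */
/* changes whenever the weights change.                            */
proc nlin data=mydata nohalve;
   parms b0=0 b1=1;               /* assumed starting values */
   model y = b0 + b1*x;
   /* assumed variance model: Var(y) proportional to (mean of y)**2 */
   _weight_ = 1 / (model.y**2);
run;
```

A different variance model would change only the `_weight_` assignment; for example, `1 / model.y` would correspond to variance proportional to the mean.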

Since (1) is easy and is commonly done in practice, I would suggest that you start there. Other variance-stabilizing transformations include sqrt(Y) and 1/Y. You should use the one that makes the most intuitive sense for your data.
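A minimal sketch of option 1, assuming the data are in a hypothetical data set named `mydata` with variables `y` and `x`. The SPEC option on the MODEL statement requests the specification (White) test, so the heteroscedasticity check can be repeated on the transformed response.

```sas
/* Log-transform the response; requires y > 0 */
data logdata;
   set mydata;
   if y > 0 then logY = log(y);   /* logY stays missing otherwise */
run;

proc reg data=logdata;
   model logY = x / spec;         /* SPEC: White's specification test */
run;
quit;
```

To try sqrt(Y) or 1/Y instead, only the assignment in the DATA step needs to change.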

All Replies

08-15-2016 11:01 AM

I will try each of the suggested options and see which works best.

Thanks for the advice.