Weighted standard deviation using proc means

02-25-2015 02:42 PM

I am computing means on weighted data in both SAS and Stata and getting wildly different values for the standard deviation. The statistician here believes SAS is incorrect.

I took the class dataset from sashelp and created two fake weights: t_wt gives everyone a weight of 1 and t_wt2 gives everyone a weight of 5. When running PROC MEANS with each weight, I expected the standard deviation to stay the same, since a constant weight changes neither the means nor the distribution of the data (and in Stata the standard deviation does stay the same). However, in SAS the standard deviation shifts from 22.77 to 50.92 for the weight variable and from 5.12 to 11.46 for the height variable. We are having trouble explaining why the results differ between SAS and Stata. Any thoughts?

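/* Add two constant weight variables to the class data */
data temp2;
   set sashelp.class;
   t_wt = 1;
   t_wt2 = 5;
run;

/* Every observation weighted by 1 */
proc means data=temp2 mean min max std n;
   var weight height;
   weight t_wt;
run;

/* Every observation weighted by 5 */
proc means data=temp2 mean min max std n;
   var weight height;
   weight t_wt2;
run;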

02-25-2015 04:18 PM

I think there's a note regarding this in the documentation.

Try using proc surveymeans instead.

EDIT: Look at the VARDEF= option instead, which sets the divisor used in the variance and standard deviation calculation. The default (VARDEF=DF) is probably not what you want; most likely you want WGT or N instead.

data temp2;
   set sashelp.class;
   t_wt = 1;
   t_wt2 = 5;
run;

/* VARDEF=WGT divides by the sum of the weights */
proc means data=temp2 mean min max std n vardef=WGT;
   var weight height;
   weight t_wt;
run;

proc means data=temp2 mean min max std n vardef=WGT;
   var weight height;
   weight t_wt2;
run;
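One possible way to see the effect of each divisor side by side is to loop over the four VARDEF= values that PROC MEANS accepts (DF, N, WDF, WGT). This is only a sketch: the macro name compare_vardef and its parameter wtvar are made up for illustration, and it assumes the temp2 data set created above already exists.

```sas
/* Hypothetical helper macro (not part of SAS): run PROC MEANS once */
/* per VARDEF= value for the weight variable passed in wtvar.       */
%macro compare_vardef(wtvar);
   %local i vd;
   %do i = 1 %to 4;
      /* Pick the i-th divisor keyword from the list */
      %let vd = %scan(DF N WDF WGT, &i);
      title "VARDEF=&vd, weight variable &wtvar";
      proc means data=temp2 mean std n vardef=&vd;
         var weight height;
         weight &wtvar;
      run;
   %end;
   title;  /* clear the title when done */
%mend compare_vardef;

%compare_vardef(t_wt);
%compare_vardef(t_wt2);
```

If the divisors behave as documented, the DF and N runs should show the standard deviation growing by a factor of sqrt(5) between t_wt and t_wt2, the WGT run should be unchanged, and the WDF run should change only slightly.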

02-25-2015 05:10 PM

PROC MEANS calculates the variance as sum[weight*(x-xbar)^2]/d, where xbar is the weighted mean and d depends on the VARDEF= option. The default is d = n-1. Because the numerator scales with the weights but n-1 does not, changing the weight from 1 to 5 multiplies the variance by 5 and the standard deviation by sqrt(5); no adjustment is made for the magnitude of the weights. You can adjust for the scale difference with the PROC MEANS statement option VARDEF=WGT (alias WEIGHT). Then d = sum[weight]. Try:

proc means data=temp2 mean min max std n vardef=WGT;
   var weight height;
   weight t_wt2;
run;

This will get you close to the same variance and standard deviation as the original. You could also try VARDEF=WDF to get d = sum[weight] - 1.
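To make the scaling concrete, here is a short worked version of the formula above for the case in this thread, where every observation gets the same weight c (c = 1 for t_wt, c = 5 for t_wt2). The symbols c, S, and s_1 are introduced here only for the derivation.

```latex
% Weighted variance as computed by PROC MEANS, with divisor d set by VARDEF=
\[
s^2 = \frac{\sum_i w_i (x_i - \bar{x}_w)^2}{d},
\qquad
\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i}
\]

% Constant weights w_i = c leave the mean unchanged and scale the numerator by c
\[
w_i = c \;\Rightarrow\; \bar{x}_w = \bar{x},
\qquad
\sum_i w_i (x_i - \bar{x}_w)^2 = c \sum_i (x_i - \bar{x})^2 = c\,S
\]

% VARDEF=DF (default): d = n - 1, so the standard deviation grows with sqrt(c)
\[
s_{\mathrm{DF}} = \sqrt{\frac{c\,S}{n-1}} = \sqrt{c}\; s_1,
\qquad
\sqrt{5} \approx 2.236
\]

% VARDEF=WGT: d = sum of weights = c n, so c cancels
\[
s_{\mathrm{WGT}} = \sqrt{\frac{c\,S}{c\,n}} = \sqrt{\frac{S}{n}}
\]

% VARDEF=WDF: d = sum of weights - 1 = c n - 1, nearly independent of c
\[
s_{\mathrm{WDF}} = \sqrt{\frac{c\,S}{c\,n - 1}}
\]
```

The factor of sqrt(5) accounts for the numbers in the original question: 22.77 x 2.236 is about 50.9 and 5.12 x 2.236 is about 11.4, which agrees with the reported 50.92 and 11.46 up to rounding of the inputs.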