Weighted Standard Deviation/Mean - Proc Means


12-08-2016 03:11 PM

I'm working on a productivity report for a group of 10 employees over a period of 10 months. For each of the 10 employees, I computed mean productivity (widgets produced per 8-hour day) for the 10-month period. I ran a weighted proc means in which the weight variable was the total number of widgets each employee produced over the 10 months (the Sum_App variable) and the analysis variable was the individual's mean productivity for the 10 months (widgets divided by days worked, the Over_Avg variable). The weighted mean was 24.80 with a min of 6.11 and a max of 31.96, but the standard deviation is showing a value of 288.25, which seems very odd to me considering the scores range from 6.11 to 31.96. Is this standard deviation inaccurate and something that shouldn't be used, or could it really be correct?

This is the code I used:

```
proc means data=lwall;
var Over_Avg;
weight Sum_App;
run;
```

Accepted Solutions

Solution

12-09-2016
04:10 PM


Posted in reply to Bildog1

12-09-2016 11:16 AM

By default, the number of observations (actually N-1) is used for the denominator when computing a standard deviation.

With your application, I suspect a more reasonable computation would divide the weighted deviations by the sum of the weights. You can do this by using the VARDEF= option. The documentation for PROC MEANS contains a discussion of what quantity each computation estimates.

```
proc means data=example VARDEF=WGT;
title "With VARDEF=WGT";
var x;
weight w;
run; title;
```
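To see why the default result looks inflated, here is a sketch of the two computations as I read them from the PROC MEANS documentation, worked by hand on the `example` data set from ballardw's reply (n = 10 rows, weights summing to 3610). The symbols x_i, w_i, and \bar{x}_w are my own notation, and the figures are rounded, so treat them as approximate.

```latex
\begin{align*}
\bar{x}_w &= \frac{\sum_i w_i x_i}{\sum_i w_i} = \frac{56480}{3610} \approx 15.6 \\
\mathrm{CSS}_w &= \sum_i w_i \,(x_i - \bar{x}_w)^2 \approx 89706 \\
s_{\mathrm{DF}} &= \sqrt{\frac{\mathrm{CSS}_w}{n - 1}} = \sqrt{\frac{89706}{9}} \approx 99.8
  && \text{(default, VARDEF=DF)} \\
s_{\mathrm{WGT}} &= \sqrt{\frac{\mathrm{CSS}_w}{\sum_i w_i}} = \sqrt{\frac{89706}{3610}} \approx 4.98
  && \text{(VARDEF=WGT)}
\end{align*}
```

The numerator is the same in both cases; only the denominator changes, which is why the default result grows with the size of the weights.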

All Replies


Posted in reply to Bildog1

12-08-2016 03:41 PM

The standard deviation treats the number of observations a bit differently than the mean does, and your weights likely differ a lot in size. Take a look at this and see if a light bulb pops:

```
data example;
input x w;
datalines;
6 300
8 40
10 500
12 90
14 800
16 60
18 1000
20 40
22 700
24 80
;
run;

proc means data=example;
title "With weights";
var x;
weight w;
run; title;

proc means data=example;
title "Without weights";
var x;
run; title;

proc means data=example;
title "With FREQ";
var x;
freq w;
run; title;
```


Posted in reply to ballardw

12-08-2016 03:52 PM

No lightbulb. Your code seems to show the same thing as my weighted proc means. Is this crazy-high standard deviation anything that can or should be used? Even yours, with a weighted mean of 15.64 and a weighted standard deviation of 99.83 when the range is 6 to 24, seems inaccurate or not to be used or trusted.


Posted in reply to Bildog1

12-08-2016 04:27 PM

Did my example with FREQ make any sense? Look more like what you might expect?

If you look at the basic formula for standard deviation it is going to use n=10 but your data actually represents many more observations.

You don't show what your actual weight values look like but this note from the documentation might apply:

CAUTION: Single extreme weight values can cause inaccurate results. When one (and only one) weight value is many orders of magnitude larger than the other weight values (for example, 49 weight values of 1 and one weight value of 1×10^14), certain statistics might not be within acceptable accuracy limits. The affected statistics are based on the second moment (such as standard deviation, corrected sum of squares, variance, and standard error of the mean). Under certain circumstances, no warning is written to the SAS log.

and

If the values of your variable are counts that represent the number of occurrences of each observation, then use this variable in the FREQ statement rather than in the WEIGHT statement. In this case, because the values are counts, they should be integers.
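In formula terms, a FREQ variable changes the observation count itself rather than just weighting the deviations. The following is a rough sketch of that computation, assuming the default VARDEF=DF and using the `example` data from my earlier post, with f_i standing for the FREQ value on row i (my notation, not the documentation's). The numbers are hand-computed and rounded.

```latex
\begin{align*}
n &= \sum_i f_i = 3610 \\
\bar{x} &= \frac{\sum_i f_i x_i}{\sum_i f_i} = \frac{56480}{3610} \approx 15.6 \\
s_{\mathrm{FREQ}} &= \sqrt{\frac{\sum_i f_i \,(x_i - \bar{x})^2}{\sum_i f_i - 1}}
  \approx \sqrt{\frac{89706}{3609}} \approx 4.99
\end{align*}
```

With WEIGHT instead of FREQ, the same sum of squares is divided by 10 - 1 = 9 because n stays at the 10 rows, which gives a value near 99.8.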
