Weighted Standard Deviation/Mean - Proc Means

Posted 12-08-2016 03:11 PM
(10757 views)

I'm working on a productivity report for a group of 10 employees over a period of 10 months. For each of the 10 employees, I computed mean productivity (widgets produced per 8-hour day) for the 10-month period. I ran a weighted PROC MEANS in which the weight variable was the total number of widgets each employee produced over the 10 months (the Sum_App variable) and the analysis variable was the individual's mean productivity for the 10 months (widgets divided by days worked, the Over_Avg variable). The weighted mean was 24.80 with a min of 6.11 and a max of 31.96, but the standard deviation shows a value of 288.25, which seems very odd to me considering the range of scores (6.11 to 31.96). Is this standard deviation inaccurate and something that shouldn't be used, or could it really be correct?

This is the code I used:

```
proc means data=lwall;
var Over_Avg;
weight Sum_App;
run;
```

1 ACCEPTED SOLUTION


By default, the number of observations (actually N-1) is used for the denominator when computing a standard deviation.

With your application, I suspect a more reasonable computation would divide the weighted deviations by the sum of the weights. You can do this by using the VARDEF= option. The documentation for PROC MEANS contains a discussion of what quantity each computation estimates.

```
proc means data=example VARDEF=WGT;
title "With VARDEF=WGT";
var x;
weight w;
run; title;
```
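
Applied to the data set and variables from the original question, the same option might look like the sketch below. It assumes `lwall`, `Over_Avg`, and `Sum_App` are exactly as described in the question; the statistic keywords listed on the PROC statement are just one reasonable selection, not something the original poster specified.

```sas
/* Sketch only: assumes data set lwall with Over_Avg and Sum_App as in the question */
proc means data=lwall vardef=wgt n mean std min max;
   title "Weighted by Sum_App, VARDEF=WGT";
   var Over_Avg;     /* per-employee mean productivity */
   weight Sum_App;   /* total widgets produced, used as the weight */
run; title;
```

With the sum of the weights as the divisor, the reported standard deviation should fall on the same scale as the Over_Avg values themselves rather than being inflated by the size of the weights.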

4 REPLIES


The standard deviation handles the number of observations a bit differently than you might expect, and there are likely large differences in your weights. Take a look at this and see if a light bulb goes on:

```
data example;
input x w;
datalines;
6 300
8 40
10 500
12 90
14 800
16 60
18 1000
20 40
22 700
24 80
;
run;

proc means data=example;
title "With weights";
var x;
weight w;
run; title;

proc means data=example;
title "Without weights";
var x;
run; title;

proc means data=example;
title "With FREQ";
var x;
freq w;
run; title;
```


Did my example with FREQ make any sense? Look more like what you might expect?

If you look at the basic formula for standard deviation, it is going to use n=10, but your data actually represent many more observations.
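
To make the n=10 point concrete, here is a sketch of the two computations being contrasted in this thread, written with $x_i$ for the values, $w_i$ for the weights, and $n$ for the number of observations. The symbols are my own notation; the PROC MEANS documentation for VARDEF= is the authoritative statement of the divisors.

$$
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i},
\qquad
s_{\text{default}} = \sqrt{\frac{\sum_{i=1}^{n} w_i \,(x_i - \bar{x}_w)^2}{n - 1}},
\qquad
s_{\text{VARDEF=WGT}} = \sqrt{\frac{\sum_{i=1}^{n} w_i \,(x_i - \bar{x}_w)^2}{\sum_{i=1}^{n} w_i}}
$$

The numerator grows with the size of the weights, but the default divisor stays at $n - 1 = 9$. For the `example` data set in this thread, working it out by hand gives $\sum w_i = 3610$, $\bar{x}_w \approx 15.65$, and a weighted sum of squared deviations of about $89706$, so the default standard deviation is roughly $\sqrt{89706 / 9} \approx 99.8$, while dividing by the sum of the weights gives roughly $\sqrt{89706 / 3610} \approx 4.98$. Only the second is on the scale of the $x$ values, which run from 6 to 24.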

You don't show what your actual weight values look like but this note from the documentation might apply:

CAUTION: Single extreme weight values can cause inaccurate results. When one (and only one) weight value is many orders of magnitude larger than the other weight values (for example, 49 weight values of 1 and one weight value of 1×10^14), certain statistics might not be within acceptable accuracy limits. The affected statistics are based on the second moment (such as standard deviation, corrected sum of squares, variance, and standard error of the mean). Under certain circumstances, no warning is written to the SAS log.

and

If the values of your variable are counts that represent the number of occurrences of each observation, then use this variable in the FREQ statement rather than in the WEIGHT statement. In this case, because the values are counts, they should be integers.

