Weighting using Proc Means

03-20-2014 01:25 PM

I am trying to make sure I use weights correctly when I calculate means with proc means.

I am trying to take a mean of several observations. Some of the observations come from data that had more individuals contributing, therefore I consider that observation to be more precise compared to the observations that fewer individuals contributed to. I am trying to take the mean where the observations with the higher number of people are weighted more heavily compared to observations that have fewer people.

Here is a snapshot of the data:

     n   var1
    14   0.80
    14   0.78
    14   0.81
    13   0.85
    11   0.87
    10   0.88
     9   0.90
     8   0.91

I would like to take the mean of var1 and give more weight to the observations with a higher value for n.

Thanks!

Accepted Solutions

03-20-2014 04:14 PM

Have you tried running proc means on the data once with Freq n; and once with Weight n; and without either? You'll see the values for mean and std deviation change.

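    /* n used as a weight */
    proc means data = have;
        var var1;
        weight n;
        title "With weight";
    run;

    /* n used as a frequency count */
    proc means data = have;
        var var1;
        freq n;
        title "With Freq";
    run;

    /* plain, unweighted statistics */
    proc means data = have;
        var var1;
        title "With neither Freq nor weight";
    run;

    /* clear the title */
    title;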
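The three steps above read a dataset named have, which the thread never shows being created. A minimal sketch of one way to build it, assuming the eight rows in the question are the whole dataset (the poster describes them only as a snapshot):

```sas
/* Assumption: these eight rows stand in for the real data. */
data have;
    input n var1;   /* n = number of individuals, var1 = observed value */
    datalines;
14 0.80
14 0.78
14 0.81
13 0.85
11 0.87
10 0.88
9 0.90
8 0.91
;
run;
```

With this in place, the three proc means steps can be run as posted and their output compared side by side.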

All Replies

03-20-2014 02:05 PM

Can you just use the FREQ statement to tell PROC MEANS the N for each value?

Posted in reply to data_null__

03-20-2014 02:22 PM

data_null__, I'm not sure what you mean. Can you provide an example?

03-20-2014 02:19 PM

Have you looked into the weight statement?

Posted in reply to Reeza

03-20-2014 02:22 PM

Yes, I am just trying to make sure I use the weight statement correctly. I don't understand how to properly use the weight statement and the vardiff option.

03-20-2014 02:41 PM

KJC, the SAS documentation has examples on how to use the WEIGHT statement to compute weighted means. Is that not a good enough explanation for you?

The VARDEF option (not vardiff) is irrelevant for computing means.

Posted in reply to PaigeMiller

03-20-2014 02:51 PM

PaigeMiller, I read the explanation but I guess I need a little more clarification.

When using the weight statement, how do I know that the thing being weighted more heavily is the **larger** value of n rather than something else?

I bring up VARDEF (excuse the mistype) because there are several different options for that and I want to make sure to use the most appropriate one given that the website says this:

"When you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See the discussion of VARDEF= and the calculation of weighted statistics in Keywords and Formulas for more information. "

I also plan to compute the confidence interval for each mean and I believe I need to include the VARDEF option in order to do that.

03-20-2014 03:01 PM

> When using the weight statement, how do I know that the thing being weighted more heavily is the larger value of n rather than something else?

Because the SAS documentation provides the actual formulas being used.

> I bring up VARDEF (excuse the mistype) because there are several different options for that and I want to make sure to use the most appropriate one given that the website says this:
>
> "When you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See the discussion of VARDEF= and the calculation of weighted statistics in Keywords and Formulas for more information. "
>
> I also plan to compute the confidence interval for each mean and I believe I need to include the VARDEF option in order to do that.

Ah, new information ... you're not just computing the mean, you are going to compute the variance or standard deviation. In that case, either VARDEF=WDF or VARDEF=WEIGHT would be appropriate, but without you explaining more about the data and what you are trying to do, I cannot advise you further.
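For reference, here is a sketch of the formulas being discussed, as I understand them from the Keywords and Formulas section of the PROC MEANS documentation. The notation is my own, not the thread's: $x_i$ is the value of var1 in row $i$, $w_i$ is the weight (here n), and $n_{\text{obs}}$ is the number of rows.

```latex
\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i},
\qquad
s^2 = \frac{\sum_i w_i \,(x_i - \bar{x}_w)^2}{d},
\qquad
d =
\begin{cases}
n_{\text{obs}} - 1 & \text{VARDEF=DF (default)} \\
n_{\text{obs}}     & \text{VARDEF=N} \\
\sum_i w_i - 1     & \text{VARDEF=WDF} \\
\sum_i w_i         & \text{VARDEF=WEIGHT}
\end{cases}
```

Each $x_i$ is multiplied by its own $w_i$, so rows with larger n pull the mean harder. For the eight rows in the question, $\sum_i w_i x_i = 78.26$ and $\sum_i w_i = 93$, giving a weighted mean of about 0.8415 against an unweighted mean of $6.80 / 8 = 0.85$. VARDEF= changes only the divisor $d$, which is why it affects the variance and standard deviation but not the mean. As far as I recall, the documentation also says that the standard error and confidence limits for the mean require the default VARDEF=DF, so that is worth checking before combining CLM with another VARDEF= value.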

Posted in reply to PaigeMiller

03-20-2014 03:02 PM

Also look into proc surveymeans.

03-20-2014 03:26 PM

If your N variable reflects a count of original observations with the value of Var1, you likely want to use FREQ N; instead of WEIGHT N;

The FREQ statement replicates the original data distribution for Var1. The WEIGHT statement is loosely analogous to placing weights with a value of N on a ruler at points with the value of Var1 and determining the mean (balance point). Variances and standard deviations will be significantly different because of the number of values used in the calculations: with WEIGHT the number of points will be 8, with FREQ over 90.
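One way to see what "replicates the original data distribution" means is to do the replication by hand. This is a sketch only, assuming a dataset named have with variables n and var1 as in the accepted solution; the names expanded and i are made up for illustration.

```sas
/* Write each row of have out n times, so every individual gets its own record. */
data expanded;
    set have;
    do i = 1 to n;
        output;
    end;
    drop i;
run;

/* No FREQ or WEIGHT statement here: the replication is already in the data. */
proc means data = expanded n mean std;
    var var1;
    title "Expanded data, no Freq or Weight";
run;
title;
```

The N, mean, and standard deviation from this step should match the FREQ n; run on the original data, while the WEIGHT n; run gives the same mean but a different N and standard deviation.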

Posted in reply to ballardw

03-20-2014 03:36 PM

Ballardw, could you give me some example code describing what you are talking about?

The value "n" is referring to the number of individuals from my sample contributing to the value for "var1".

Posted in reply to ballardw

03-21-2014 03:43 PM

Thank you!