KJC
Calcite | Level 5

I am trying to make sure I use weights correctly when I calculate means with PROC MEANS.

I am trying to take a mean of several observations. Some observations are based on more contributing individuals than others, so I consider them more precise. I want the mean to weight observations with more contributing individuals more heavily than observations with fewer.

Here is a snapshot of the data:

n     var1
14    0.80
14    0.78
14    0.81
13    0.85
11    0.87
10    0.88
9     0.90
8     0.91

I would like to take the mean of var1 and give more weight to the observations with a higher value for n.

Thanks!


data_null__
Jade | Level 19

Can you just use the FREQ statement to tell PROC MEANS the N for each value?

KJC
Calcite | Level 5

data_null__, I'm not sure what you mean. Can you provide an example?

Reeza
Super User

Have you looked into the weight statement?

Base SAS(R) 9.2 Procedures Guide

KJC
Calcite | Level 5

Yes, I am just trying to make sure I use the weight statement correctly. I don't understand how to properly use the weight statement and the vardiff option.

PaigeMiller
Diamond | Level 26

KJC, the SAS documentation has examples on how to use the WEIGHT statement to compute weighted means. Is that not a good enough explanation for you?

The VARDEF option (not vardiff) is irrelevant for computing means.

--
Paige Miller
KJC
Calcite | Level 5

PaigeMiller, I read the explanation but I guess I need a little more clarification.

When using the weight statement, how do I know that the thing being weighted more heavily is the larger value of n rather than something else?

I bring up VARDEF (excuse the mistype) because there are several different options for it, and I want to make sure I use the most appropriate one, given that the documentation says this:

"When you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See the discussion of VARDEF= and the calculation of weighted statistics in Keywords and Formulas for more information. "


I also plan to compute the confidence interval for each mean and I believe I need to include the VARDEF option in order to do that.

PaigeMiller
Diamond | Level 26

> When using the weight statement, how do I know that the thing being weighted more heavily is the larger value of n rather than something else?

Because the SAS documentation provides the actual formulas being used.
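
For reference, here is the standard weighted-mean formula, which is what the WEIGHT statement applies to the mean. The symbols w_i, x_i and k are notation chosen here, not names taken from the documentation: w_i is the value of the WEIGHT variable (n) for row i, x_i is the analysis variable (var1), and k is the number of rows.

```latex
\[
\bar{x}_w \;=\; \frac{\sum_{i=1}^{k} w_i\,x_i}{\sum_{i=1}^{k} w_i},
\qquad w_i = n_i,\quad x_i = \mathrm{var1}_i .
\]
% Worked out for the eight rows posted in the question:
\[
\bar{x}_w = \frac{78.26}{93} \approx 0.8415,
\qquad
\bar{x} = \frac{6.80}{8} = 0.85 .
\]
```

The weighted mean is lower than the unweighted mean because the rows with larger n have smaller values of var1, so those rows are the ones pulling the result toward them.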

> I bring up VARDEF (excuse the mistype) because there are several different options for that and I want to make sure to use the most appropriate one given that the website says this
>
> "When you use the WEIGHT statement, consider which value of the VARDEF= option is appropriate. See the discussion of VARDEF= and the calculation of weighted statistics in Keywords and Formulas for more information. "
>
> I also plan to compute the confidence interval for each mean and I believe I need to include the VARDEF option in order to do that.

Ah, new information ... you're not just computing the mean, you are going to compute the variance or standard deviation. In that case, either VARDEF=WDF or VARDEF=WEIGHT would be appropriate, but without you explaining more about the data and what you are trying to do, I cannot advise you further.

--
Paige Miller
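
A minimal sketch of how the two requests in this exchange might look in code, assuming the eight rows from the question are stored in a dataset named HAVE (the name used elsewhere in this thread). One caveat worth checking against the PROC MEANS documentation for your release: as far as I know, the standard error, confidence limits and t statistics are only computed with the default VARDEF=DF, so the confidence-limit step below leaves VARDEF= alone and the VARDEF=WDF step requests only the standard deviation. Whether n should be a weight at all is a separate question.

```sas
/* Assumed input: the rows posted in the question, read into WORK.HAVE */
data have;
    /* @@ holds the input line so several observations can be read from it */
    input n var1 @@;
    datalines;
14 0.80  14 0.78  14 0.81  13 0.85  11 0.87  10 0.88  9 0.90  8 0.91
;
run;

/* Weighted mean with standard error and 95% confidence limits.
   VARDEF= is left at its default (DF). */
proc means data=have n sumwgt mean stderr clm alpha=0.05;
    var var1;
    weight n;
    title "Weighted mean with confidence limits (default VARDEF=DF)";
run;

/* Weighted standard deviation with divisor (sum of weights - 1). */
proc means data=have n sumwgt mean std vardef=wdf;
    var var1;
    weight n;
    title "Weighted standard deviation (VARDEF=WDF)";
run;
title;
```

Both steps report the same weighted mean; only the variance-based statistics depend on VARDEF=.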
Reeza
Super User

Also look into proc surveymeans.

ballardw
Super User

If your N variable reflects a count of original observations with the value of Var1, you likely want to use FREQ N; instead of WEIGHT N;

The FREQ statement replicates the original data distribution for Var1. The WEIGHT statement is loosely analogous to placing weights with a value of N on a ruler at points with the value of Var1 and determining the mean (balance point). Variances and standard deviations will be significantly different because of the number of values used in the calculations: with WEIGHT the number of points will be 8, with FREQ over 90.
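
One way to see what "replicates the original data distribution" means is to expand each row into n copies and compare an ordinary PROC MEANS on the expanded data with FREQ n on the original. This is only a sketch: it assumes the eight rows from the question are the whole dataset, and the dataset name EXPANDED and the loop counter i are made up for the illustration.

```sas
/* Assumed input: the rows posted in the question, read into WORK.HAVE */
data have;
    input n var1 @@;
    datalines;
14 0.80  14 0.78  14 0.81  13 0.85  11 0.87  10 0.88  9 0.90  8 0.91
;
run;

/* Write each row n times, so the 8 rows become 93 rows */
data expanded;
    set have;
    do i = 1 to n;
        output;
    end;
    drop i;
run;

/* Plain statistics on the expanded data ... */
proc means data=expanded n mean std;
    var var1;
    title "Expanded data, no FREQ or WEIGHT";
run;

/* ... should match FREQ n on the original 8 rows */
proc means data=have n mean std;
    var var1;
    freq n;
    title "Original data with FREQ n";
run;
title;
```

Both steps should report N = 93 with the same mean and standard deviation, whereas WEIGHT n on the original data reports N = 8 and a different standard deviation.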

KJC
Calcite | Level 5

Ballardw, could you give me some example code describing what you are talking about?

The value "n" is referring to the number of individuals from my sample contributing to the value for "var1".

ballardw
Super User

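Have you tried running PROC MEANS on the data once with FREQ n;, once with WEIGHT n;, and once with neither? You'll see the values for the mean and standard deviation change.

/* Assumed input: the rows posted in the question, read into WORK.HAVE */
data have;
     input n var1 @@;
     datalines;
14 0.80  14 0.78  14 0.81  13 0.85  11 0.87  10 0.88  9 0.90  8 0.91
;
run;

/* n treated as a weight: N stays at the number of rows */
proc means data = have;
     var var1;
     weight n;
     title "With weight";
run;

/* n treated as a frequency: each row counts as n observations */
proc means data = have;
     var var1;
     freq n;
     title "With Freq";
run;

/* Unweighted baseline */
proc means data = have;
     var var1;
     title "With neither Freq nor weight";
run;

/* Clear the title so it does not carry over to later output */
title;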

KJC
Calcite | Level 5

Thank you!
