BookmarkSubscribeRSS Feed
ChrisNZ
Tourmaline | Level 20

Hi,

Is the only way to derive a weighted average when some weights are negative to use proc sql or similar?

Can't proc means do it? I find this hard to believe.

Thank you.

11 REPLIES 11
SteveDenham
Jade | Level 19

Let me ask the dumb question first.  What is the meaning of a negative weight in this context?  Is it the same as a positive weight for a negative value?

Steve Denham

ChrisNZ
Tourmaline | Level 20

Indeed it is. Haha I see where you are going...

I could indeed create a second set of variables where the negativity is transferred to the analysis variable.

That's messy (I have many such calculations, and I also need to derive the sum of the weights), but better than doing all the crossings in sql I suppose.

Thanks for the idea. Smiley Happy

Still, I can't believe proc means can't do it. It would be so easy, it's mad it's been omitted.

Message was edited by: Christian Graffeuille Mmm... Actually no, the denominator will be too large if I sum the absolute values of the weights. OK, awaiting the next post...

ChrisNZ
Tourmaline | Level 20

Since I am using several proc means options which will be a huge pain to replicate in proc sql (completetypes,

preloadfmt, and especially mlf), what I'll do for now is:

1) derive the products before doing the summaries,

2) only calculate sums in my numerous proc means,

3) derive ratios.

I am not impressed that negative weights have been barred.

An option would have been so very easy to implement (just leave the data as is), and its absence is such a pain to circumvent.

SteveDenham
Jade | Level 19

That ought to solve things.

I am still curious.  What is the meaning of a negative weight, if you can share the context?

Steve Denham

ChrisNZ
Tourmaline | Level 20

I am unsure how much I am allowed to divulge, but for DQ reasons some group totals are misestimated and compensation values are inserted to make the overall totals align. These compensation values can be negative.

SteveDenham
Jade | Level 19

Given that, your approach to caluclate sums makes even more sense to me.

Steve Denham

ChrisNZ
Tourmaline | Level 20

The approach I had to use is needlessly complex though.

proc means is meant to do just this sort of things, and I had instead to fudge a workaround that will be much more difficult to maintain.

Rick_SAS
SAS Super FREQ

Mathematically, a weighted average is only defined for positive weights. The denominator of a weighted mean is the sum of the weights.  See the Wikipedia article: Weighted arithmetic mean - Wikipedia, the free encyclopedia

If you attempt to introduce negative weights, all kinds of bad things can happen, such as the denominator being zero.  If the weights are w1=+1/2 and w2=-1/2, what is the weighted average of x1=1 and x2=2?  Is it the same or different from the weighted average of x1=1 and x2=200?

Of course PROC MEANS computes much more than just means. Computing a weighted variance or quantile are not defnied for negative weights. Do you accept a negative variance?  I don't.  

I think in most cases negative weights do not make sense. If you can cite a published paper that defines weighted descriptive statistics for negative weights, I would like to read it.

ChrisNZ
Tourmaline | Level 20

And then there is the real world, Rick.

I agree that negative weights are statistically meaningless.

Just like in my case, negative exposures, or in other cases negative inventories or distances are meaningless.

There is no entry for negative such measures on wikipedia either.

Yet sometimes these calculations have to be made.

If the world, and the world's data, were perfect, you'd be right.

As it is, this calculation is being performed, because it is necessary, and it would be very nice if proc means was unbridled and did for us what it does so well within its present perfect-world statistically-unspoiled boundaries.

We could call the option to enable this feature ngtwgt-real-statisticians-will-never-touch-this if needed, but it would nevertheless be useful in some (possibly rare) cases, and the cost of adding it is very low since it only involves leaving the data is and checking for nil denominators, which I don't doubt is already being considered.

Ksharp
Super User

if you don't want SQL, why not use data step ?

BJM
Calcite | Level 5 BJM
Calcite | Level 5

Maybe transform your fields so there aren't any negative values?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 9881 views
  • 1 like
  • 5 in conversation