Hi,
Is the only way to derive a weighted average when some weights are negative to use proc sql or similar?
Can't proc means do it? I find this hard to believe.
Thank you.
Let me ask the dumb question first. What is the meaning of a negative weight in this context? Is it the same as a positive weight for a negative value?
Steve Denham
Indeed it is. Haha I see where you are going...
I could indeed create a second set of variables where the negativity is transferred to the analysis variable.
That's messy (I have many such calculations, and I also need to derive the sum of the weights), but better than doing all the crossings in sql I suppose.
Thanks for the idea.
Still, I can't believe proc means can't do it. It would be so easy, it's mad it's been omitted.
Message was edited by: Christian Graffeuille
Mmm... Actually no, the denominator will be too large if I sum the absolute values of the weights. OK, awaiting the next post...
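To make the denominator problem concrete, here is a minimal sketch in Python. It is used only to show the arithmetic and is not SAS syntax; the two values and weights are made up for illustration. It compares the signed weighted average with the result of moving the sign onto the analysis variable and weighting by the absolute value.

```python
# Hypothetical data: two values, the second carrying a negative weight.
x = [10.0, 20.0]
w = [3.0, -1.0]

# Signed weighted average: sum(w*x) / sum(w).
signed_avg = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

# Sign moved onto the value, absolute value kept as the weight.
x_flipped = [xi if wi >= 0 else -xi for wi, xi in zip(w, x)]
w_abs = [abs(wi) for wi in w]
flipped_avg = sum(wi * xi for wi, xi in zip(w_abs, x_flipped)) / sum(w_abs)

print(signed_avg)   # numerator 10, denominator 2
print(flipped_avg)  # numerator 10, denominator 4
```

The numerators agree, but the denominator grows from 2 to 4 once the weights are made positive, so the two averages differ.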
Since I am using several proc means options which will be a huge pain to replicate in proc sql (completetypes, preloadfmt, and especially mlf), what I'll do for now is:
1) derive the products before doing the summaries,
2) only calculate sums in my numerous proc means,
3) derive ratios.
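The three steps above can be sketched as follows. This is Python rather than SAS, purely to show the logic; the group names, values and weights are hypothetical, and in the real workflow step 2 would be the proc means summaries.

```python
from collections import defaultdict

# Hypothetical rows: (group, value, weight); weights may be negative.
rows = [
    ("A", 10.0, 3.0),
    ("A", 20.0, -1.0),
    ("B", 5.0, 2.0),
    ("B", 7.0, 2.0),
]

# Steps 1 and 2: derive the product w*x per row, then only sum per group.
sum_wx = defaultdict(float)
sum_w = defaultdict(float)
for group, x, w in rows:
    sum_wx[group] += w * x
    sum_w[group] += w

# Step 3: derive the ratio, leaving it undefined when the weights sum to zero.
weighted_avg = {
    g: (sum_wx[g] / sum_w[g] if sum_w[g] != 0 else None) for g in sum_w
}
print(weighted_avg)
```

Because only sums are computed in the middle step, the sum of the weights is available as a by-product, and the zero-denominator check is made explicitly in the last step.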
I am not impressed that negative weights have been barred.
An option would have been so very easy to implement (just leave the data as is), and its absence is such a pain to circumvent.
That ought to solve things.
I am still curious. What is the meaning of a negative weight, if you can share the context?
Steve Denham
I am unsure how much I am allowed to divulge, but for DQ reasons some group totals are misestimated and compensation values are inserted to make the overall totals align. These compensation values can be negative.
Given that, your approach to calculate sums makes even more sense to me.
Steve Denham
The approach I had to use is needlessly complex though.
proc means is meant to do just this sort of thing, and instead I had to fudge a workaround that will be much more difficult to maintain.
Mathematically, a weighted average is only defined for positive weights. The denominator of a weighted mean is the sum of the weights. See the Wikipedia article "Weighted arithmetic mean".
If you attempt to introduce negative weights, all kinds of bad things can happen, such as the denominator being zero. If the weights are w1=+1/2 and w2=-1/2, what is the weighted average of x1=1 and x2=2? Is it the same or different from the weighted average of x1=1 and x2=200?
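A small sketch of that example, written in Python only to show the arithmetic (the helper name weighted_mean is hypothetical, and returning None for a zero denominator is an assumption about how one might handle it):

```python
def weighted_mean(values, weights):
    """Return sum(w*x)/sum(w), or None when the weights sum to zero."""
    total_w = sum(weights)
    if total_w == 0:
        return None
    return sum(w * x for w, x in zip(weights, values)) / total_w

weights = [0.5, -0.5]
numerator_a = sum(w * x for w, x in zip(weights, [1, 2]))
numerator_b = sum(w * x for w, x in zip(weights, [1, 200]))

print(numerator_a, numerator_b, sum(weights))
print(weighted_mean([1, 2], weights), weighted_mean([1, 200], weights))
```

The numerators differ (-0.5 versus -99.5), but the weights sum to zero in both cases, so neither ratio is defined.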
Of course PROC MEANS computes much more than just means. A weighted variance or quantile is not defined for negative weights. Do you accept a negative variance? I don't.
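To illustrate the negative-variance point, here is a minimal Python sketch. It assumes one particular definition, the weighted sum of squared deviations divided by the sum of the weights; other divisors exist, and the data are made up.

```python
def weighted_variance(values, weights):
    """Weighted sum of squared deviations divided by the sum of the weights.

    The divisor is an assumption made for this illustration.
    """
    total_w = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total_w
    return sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / total_w

# Hypothetical data: the largest value carries the negative weight.
variance = weighted_variance([0.0, 1.0, 10.0], [2.0, 2.0, -1.0])
print(variance)
```

With these numbers the weights sum to 3 and the weighted mean is -8/3, yet the result is below zero because the negatively weighted observation contributes the largest squared deviation.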
I think in most cases negative weights do not make sense. If you can cite a published paper that defines weighted descriptive statistics for negative weights, I would like to read it.
And then there is the real world, Rick.
I agree that negative weights are statistically meaningless.
So are negative exposures, as in my case, or negative inventories or distances in other cases.
There is no Wikipedia entry for negative values of such measures either.
Yet sometimes these calculations have to be made.
If the world, and the world's data, were perfect, you'd be right.
As it is, this calculation is being performed, because it is necessary, and it would be very nice if proc means was unbridled and did for us what it does so well within its present perfect-world statistically-unspoiled boundaries.
We could call the option that enables this feature ngtwgt-real-statisticians-will-never-touch-this if needed. It would nevertheless be useful in some (possibly rare) cases, and the cost of adding it is very low: it only involves leaving the data as is and checking for nil denominators, which I don't doubt is already being considered.
If you don't want SQL, why not use a data step?
Maybe transform your fields so there aren't any negative values?