## Interpretation of weighted percentiles

I know PROC UNIVARIATE will give me weighted percentiles, but how do I interpret the results? If the weighted median is 5.5, what does that say about my data? I guess I don't understand how the weights are affecting the statistics.

Here is an example:

``````
data Have;
input x w;
datalines;
1 1
2 2
3 1
4 2
5 2
6 4
7 3
8 1
;
run;

proc univariate data=Have;
var x;
weight w;
ods select quantiles;
run;
``````

When I run this code, I get:

Q3=6.5
Median=5.5
Q1=3.5

I don't understand these values. Why is the median 5.5? And how are the other quartiles computed?

1 ACCEPTED SOLUTION

## Re: Interpretation of weighted percentiles

For an interpretation of weighted percentiles, see the article "Weighted percentiles."

The basic idea is to sort the data in increasing order (your data are already sorted). Then add up the cumulative weights and take the percentiles of the total weight.

For your data, the weights sum to 16. The 50th percentile is therefore the data value for which half the weight is on one side and half is on the other. If you run down your data, you see that any number between 5 and 6 has half the weight (8 units) on each side. PROC UNIVARIATE reports the midpoint of that interval, which is why the median is 5.5.

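The other percentiles are similar. The 25th percentile is the data value for which 25% of the weight (=4 units) is below and 75% (=12 units) is above. For your data, any number between 3 and 4 has that property, so Q1 is reported as 3.5. Likewise, any number between 6 and 7 has 75% of the weight (=12 units) below it, so Q3 is 6.5.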
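
The cumulative-weight bookkeeping described above can be sketched with a DATA step. This is only an illustration, assuming the `Have` data set from the question is already sorted by `x`. The names `CumWt`, `cumW`, `cumProp`, and the macro variable `totalW` are made up for this example.

```sas
/* total weight, stored in a macro variable (hypothetical name: totalW) */
proc sql noprint;
select sum(w) into :totalW from Have;
quit;

data CumWt;
set Have;                   /* assumes Have is sorted by x */
cumW + w;                   /* sum statement: running total of the weights */
cumProp = cumW / &totalW;   /* cumulative proportion of the total weight */
run;

proc print data=CumWt noobs;
run;
```

For the example data, the running totals are 1, 3, 4, 6, 8, 12, 15, 16. The cumulative weight equals exactly 4 (25%) at x=3, 8 (50%) at x=5, and 12 (75%) at x=6, which is consistent with the reported quartiles being the midpoints 3.5, 5.5, and 6.5.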

## Re: Interpretation of weighted percentiles

@Rick_SAS That is an amazing article! So clear!

Why won't UNIVARIATE create any graphs? I tried to make a histogram, but it complains that the graphs can't be created if I use a weight.

## Re: Interpretation of weighted percentiles

Weighted graphics are a complicated topic for which statisticians have not reached a consensus. However, if you want to visualize the weighted distribution, you can create a weighted empirical CDF (ECDF), as shown in the article that I mentioned earlier. For your data, the following program computes and plots the weighted ECDF:

``````
data Have;
input x w;
datalines;
1 1
2 2
3 1
4 2
5 2
6 4
7 3
8 1
;
run;

title "Weighted Percentiles";

/* put sum of weights into macro variable */
proc sql noprint;
select sum(w) into :sumWt from Have;
quit;
%put &=sumWt;   /* display value in SAS log */

data Want;
set Have;
wt = w / &sumWt;   /* standardize Sum(wt)=1 */
run;

proc means data=Want p25 median p75;
var x;
weight wt;
run;

/* use IML to form weighted ECDF from data */
proc iml;
use Want; read all var {x wt}; close;
cumWt = cusum(wt);
cutPts = 0 // cumWt;

/* generate data for WECDF */
t = do(0, 0.999, 0.001);
idx = bin(t, cutPts);
q = x[idx];

create WECDF var {t q x}; append; close;
QUIT;

title "Weighted ECDF";
proc sgplot data=wecdf noautolegend;
xaxis grid label="x";
yaxis grid offsetmin=0.1 label="Cumulative Proportion";
step x=q y=t;
fringe x / lineattrs=(color=black);
refline 0 / axis=y;
run;
``````

## Re: Interpretation of weighted percentiles

You say your weighted median is 3.2? Explain how you calculated this.

Here's how SAS gets these values: it uses observation 1 one time, observation 2 two times, observation 6 four times, and so on.

It is as if the following data set HAVE1 had been provided instead of HAVE:

``````
data have1;
set have;
/* write each observation w times */
do i=1 to w;
   output;
end;
drop i;
run;
``````

Then, PROC UNIVARIATE on HAVE1 without the weight statement gives the same median as PROC UNIVARIATE on HAVE with the weight statement.
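
To check this, a minimal sketch is to run PROC UNIVARIATE on the expanded data with no WEIGHT statement. The second step uses the FREQ statement, which treats each observation as if it appeared `w` times and so avoids building HAVE1. Both steps assume the weights are positive integers, as they are in this example.

```sas
/* quantiles of the expanded data; no WEIGHT statement */
proc univariate data=have1;
var x;
ods select quantiles;
run;

/* same idea without expanding: FREQ counts each row w times */
proc univariate data=have;
var x;
freq w;
ods select quantiles;
run;
```

The expanded data has 16 values (1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8), so the median is the average of the 8th and 9th values, (5+6)/2 = 5.5. Both steps should therefore report a median of 5.5 with quartiles 3.5 and 6.5.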

--
Paige Miller

## Re: Interpretation of weighted percentiles

I did explain. I ran PROC UNIVARIATE.

## Re: Interpretation of weighted percentiles

@WeiChen wrote:

> I did explain. I ran PROC UNIVARIATE.

More information is needed. What PROC UNIVARIATE code gives a median of 3.2 for data set HAVE?

--
Paige Miller

## Re: Interpretation of weighted percentiles

Oh, sorry. I used 3.2 as a hypothetical example. But then I decided to add example data and a PROC UNIVARIATE step, and I forgot to update my sentence. I will do that now.
