Interpretation of weighted percentiles

☑ This topic is **solved**.
Posted 01-10-2022 10:28 AM
(296 views)

I know PROC UNIVARIATE will give me weighted percentiles, but how do I interpret the results? If the weighted median is 5.5, what does that say about my data? I guess I don't understand how the weights are affecting the statistics.

Here is an example:

```
data Have;
input x w;
datalines;
1 1
2 2
3 1
4 2
5 2
6 4
7 3
8 1
;
run;
proc univariate data=Have;
var x;
weight w;
ods select quantiles;
run;
```

When I run this code I get

Q3=6.5

Median=5.5

Q1=3.5

I don't understand these values. Why is the median 5.5? And how are the other quartiles computed?

Accepted Solutions

For an interpretation of weighted percentiles, see the article "Weighted percentiles."

The basic idea is to sort the data in increasing order (your data are already sorted). Then add up the cumulative weights and take the percentiles of the total weight.

For your data, the weights sum to 16. The 50th percentile is therefore the data value for which half the weight is on one side and half is on the other. If you run down your data, you see that any number between 5 and 6 has half the weight (8 units) on each side. PROC UNIVARIATE reports the midpoint of that interval, which is 5.5.

The other percentiles are similar. The 25th percentile is the data value for which 25% of the weight (=4 units) is below and 75% (=12 units) is above. For your data, any number between 3 and 4 has that property, so the reported value is the midpoint, 3.5.
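
The steps above can be sketched as a short DATA step. This is only an illustration, not code from the referenced article: the data set name CumWt, the variables cumW and cumProp, and the macro variable totalW are names made up for this sketch, and it assumes Have is already sorted by x.

```
/* total weight, stored in a macro variable (hypothetical name totalW) */
proc sql noprint;
   select sum(w) into :totalW from Have;
quit;

data CumWt;
   set Have;                  /* assumes Have is already sorted by x */
   cumW + w;                  /* sum statement: running total of the weights */
   cumProp = cumW / &totalW;  /* cumulative proportion of the total weight */
run;

proc print data=CumWt noobs;
   var x w cumW cumProp;
run;
```

In the printed table the cumulative weight reaches 4 at x=3, 8 at x=5, and 12 at x=6, which are exactly 25%, 50%, and 75% of the total weight of 16. Each cutoff is hit exactly, so the quartile falls between that x and the next one, which is consistent with the 3.5, 5.5, and 6.5 reported in the question.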

@Rick_SAS That is an amazing article! So clear!

Why won't UNIVARIATE create any graphs? I tried to make a histogram, but it complains that the graphs can't be created if I use a weight.

Weighted graphics are a complicated topic for which statisticians have not reached a consensus. However, if you want to visualize the weighted distribution, you can create a weighted empirical CDF, as shown in the article that I mentioned earlier. For your data, the weighted ECDF would look like this:

```
data Have;
input x w;
datalines;
1 1
2 2
3 1
4 2
5 2
6 4
7 3
8 1
;
run;
title "Weighted Percentiles";
/* put sum of weights into macro variable */
proc sql noprint;
select sum(w) into :sumWt from Have;
quit;
%put &=sumWt; /* display value in SAS log */
data Want;
set Have;
wt = w / &sumWt; /* standardize Sum(wt)=1 */
run;
proc means data=Want p25 median p75;
var x;
weight wt;
run;
/* use IML to form weighted ECDF from data */
proc iml;
use Want; read all var {x wt}; close;
cumWt = cusum(wt);
cutPts = 0 // cumWt;
/* generate data for WECDF */
t = do(0, 0.999, 0.001);
idx = bin(t, cutPts);
q = x[idx];
create WECDF var {t q x}; append; close;
QUIT;
title "Weighted ECDF";
proc sgplot data=wecdf noautolegend;
xaxis grid label="x";
yaxis grid offsetmin=0.1 label="Cumulative Proportion";
step x=q y=t;
fringe x / lineattrs=(color=black);
refline 0 / axis=y;
run;
```

You say your weighted median is 3.2? Explain how you calculated this.

Here's how SAS gets these values: it uses observation 1 one time, observation 2 two times, observation 6 four times, and so on, as if the data set HAVE1 had been provided instead of HAVE.

```
data have1;
set have;
do i=1 to w;
output;
end;
drop i;
run;
```

Then, PROC UNIVARIATE on HAVE1 without the weight statement gives the same median as PROC UNIVARIATE on HAVE with the weight statement.
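
A sketch of that check, assuming HAVE1 was created by the DATA step above. Expanding rows this way is only possible because the weights in this example are positive integers; it is an illustration for this data set, not a general recipe for weighted analyses.

```
/* unweighted analysis of the expanded data (16 rows) */
proc univariate data=have1;
   var x;
   ods select quantiles;
run;

/* for comparison: weighted analysis of the original data (8 rows) */
proc univariate data=have;
   var x;
   weight w;
   ods select quantiles;
run;
```

The expanded data are 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, so the unweighted median is the average of the 8th and 9th values, (5+6)/2 = 5.5, and both steps should show that value in the Quantiles table.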

--

Paige Miller


I did explain. I ran PROC UNIVARIATE.

@WeiChen wrote:

I did explain. I ran PROC UNIVARIATE.

More information is needed. What PROC UNIVARIATE code gives a median of 3.2 for data set HAVE?

--

Paige Miller
