turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Calculate percentile using data in descending orde...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-18-2017 03:02 PM

Hello,

I usually use proc univariate to calculate percentiles, and by default, proc univariate ranks the data in ascending order, then calcuates the percentile.

Now, I have a score variable that has reversed meaning (a low score means high performance, while a high score means a bad performance). So, basically, I just want to rank the variable in descending order then calculate the percentile. The desired outcome will be a small percentile corresponds to a high score, and a large percentile corresponds to a low score.

Does anybody know how to change the default ascending order to descending order in proc univariate when calculating percentile? Or is there any other way that I can achieve my desired outcome using SAS?

Thank you so much!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-18-2017 04:07 PM

One approach without a lot of work would be to calculate the percentile as you have in the past. Then use a data step to subract the percentile from 1 (or 100 if the value is coming that way).

Or If your current values all 0 or larger (or all 0 or negative) then multiply the variable in a data step by -1 and then generate the percentile. Then multiple the Value by -1 again to get the original value.

You might get slightly different results between the two based on ties. If you need to control that then you may want want to consider the second approach an use proc Rank with groups=100

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-18-2017 04:37 PM

Given that you have sorted your data set from highest to lowest score, calculating percentiles is easy enough.

Allowing the same score to fall into different percentiles:

data want;

set have nobs=_total_obs_;

by descending score;

percentile = ceil(100 * _n_ / _total_obs_);

run;

Forcing the same score to fall into the same percentile (possibly causing uneven grouping):

data want;

set have nobs=_total_obs_;

by descending score;

retain percentile;

if first.score then percentile = ceil(100 * _n_ / _total_obs_);

run;

I can't check the code right now, but I think the formulas are correct. Also, make sure your incoming data set does not already contain a variable named PERCENTILE. Otherwise the second program will come up with the wrong answer.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

05-19-2017 10:17 AM

You can use proc rank to get quantile too.

```
proc rank data=sashelp.class out=want groups=10 descending;
var weight;
ranks rank;
run;
proc sort data=want;
by rank;
run;
proc print noobs;run;
```

Or you can change it by hand after geting quantile by proc univariate.

for example:

data want;

set have;

rename P95=P05 P10=P90 ........;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Ksharp

05-22-2017 03:15 PM

Yes, PROC RANK has a "descending" option, but if that options is used, please note that records that are assigned to a common group in normal proc rank will not be guaranteed to stay together using the DESCENDING option, even when there are no ties.

Consider the following, using a modification of the height variable that guarantees no ties. You'll see that the group with the lowest height values has only one record in dataset ASC, but 2 records in DESC.

```
data class;
set sashelp.class (keep=name height);
height=height+_n_/100;
h2=height;
run;
proc rank data=class out=desc groups=10 descending;
var height;
run;
proc sort data=desc; by h2 name;run;
proc rank data=class out=asc groups=10 ;
var height;
run;
proc sort data=asc; by h2 name;run;
```

In other words, even though the group assignment in both cases range from 0 to 9 (or 9 to 0), assignment in one dataset is not always nine minus the assignment in the other. That is due to the fact that when groups must have differing sizes, the "extra" observations are apparently assigned, one by one, to the highest group id, then the second highest, third highest, etc.