Help using Base SAS procedures

Calculate percentile using data in descending order

Reply
Occasional Learner CC3
Occasional Learner
Posts: 1

Calculate percentile using data in descending order

Hello,

 

I usually use proc univariate to calculate percentiles, and by default, proc univariate ranks the data in ascending order, then calcuates the percentile.

 

Now, I have a score variable that has reversed meaning (a low score means high performance, while a high score means a bad performance). So, basically, I just want to rank the variable in descending order then calculate the percentile. The desired outcome will be a small percentile corresponds to a high score, and a large percentile corresponds to a low score.

 

Does anybody know how to change the default ascending order to descending order in proc univariate when calculating percentile? Or is there any other way that I can achieve my desired outcome using SAS?

 

Thank you so much!

Super User
Posts: 11,343

Re: Calculate percentile using data in descending order

One approach without a lot of work would be to calculate the percentile as you have in the past. Then use a data step to subract the percentile from 1 (or 100 if the value is coming that way).

 

Or If your current values all 0 or larger (or all 0 or negative) then multiply the variable in a data step by -1 and then generate the percentile. Then multiple the Value by -1 again to get the original value.

 

You might get slightly different results between the two based on ties. If you need to control that then you may want want to consider the second approach an use proc Rank with groups=100

Super User
Posts: 5,516

Re: Calculate percentile using data in descending order

Given that you have sorted your data set from highest to lowest score, calculating percentiles is easy enough.

 

Allowing the same score to fall into different percentiles:

 

data want;

set have nobs=_total_obs_;

by descending score;

percentile = ceil(100 * _n_  / _total_obs_);

run;

 

Forcing the same score to fall into the same percentile (possibly causing uneven grouping):

 

data want;

set have nobs=_total_obs_;

by descending score;

retain percentile;

if first.score then percentile = ceil(100 * _n_ / _total_obs_);

run;

 

I can't check the code right now, but I think the formulas are correct.  Also, make sure your incoming data set does not already contain a variable named PERCENTILE.  Otherwise the second program will come up with the wrong answer.

Super User
Posts: 10,044

Re: Calculate percentile using data in descending order

You can use proc rank to get quantile too.

proc rank data=sashelp.class out=want groups=10 descending;
var weight;
ranks rank;
run;
proc sort data=want;
 by rank;
run;
proc print noobs;run;

Or you can change it by hand after geting quantile by proc univariate.

for example:

 

data want;

 set have;

 rename P95=P05 P10=P90 ........;

run;

Trusted Advisor
Posts: 1,022

Re: Calculate percentile using data in descending order

Yes, PROC RANK has a "descending" option, but if that options is used, please note that records that are assigned to a common group in normal proc rank will not be guaranteed to stay together using the DESCENDING option, even when there are no ties. 

 

Consider the following, using a modification of the height variable that guarantees no ties.  You'll see that the group with the lowest height values has only one record in dataset ASC, but 2 records in DESC.

 

data class;
  set sashelp.class (keep=name height);
  height=height+_n_/100;
  h2=height;
run;

proc rank data=class out=desc groups=10 descending;
  var height;
run;
proc sort data=desc; by h2 name;run;

proc rank data=class out=asc groups=10 ;
  var height;
run;
proc sort data=asc; by h2 name;run;

 

In other words, even though the group assignment in both cases range from 0 to 9 (or 9 to 0), assignment in one dataset is not always nine minus the assignment in the other.  That is due to the fact that when groups must have differing sizes, the "extra" observations are apparently assigned, one by one, to the highest group id, then the second highest, third highest, etc.

Ask a Question
Discussion stats
  • 4 replies
  • 231 views
  • 0 likes
  • 5 in conversation