BookmarkSubscribeRSS Feed
CC3
Calcite | Level 5 CC3
Calcite | Level 5

Hello,

 

I usually use proc univariate to calculate percentiles, and by default, proc univariate ranks the data in ascending order, then calcuates the percentile.

 

Now, I have a score variable that has reversed meaning (a low score means high performance, while a high score means a bad performance). So, basically, I just want to rank the variable in descending order then calculate the percentile. The desired outcome will be a small percentile corresponds to a high score, and a large percentile corresponds to a low score.

 

Does anybody know how to change the default ascending order to descending order in proc univariate when calculating percentile? Or is there any other way that I can achieve my desired outcome using SAS?

 

Thank you so much!

4 REPLIES 4
ballardw
Super User

One approach without a lot of work would be to calculate the percentile as you have in the past. Then use a data step to subract the percentile from 1 (or 100 if the value is coming that way).

 

Or If your current values all 0 or larger (or all 0 or negative) then multiply the variable in a data step by -1 and then generate the percentile. Then multiple the Value by -1 again to get the original value.

 

You might get slightly different results between the two based on ties. If you need to control that then you may want want to consider the second approach an use proc Rank with groups=100

Astounding
PROC Star

Given that you have sorted your data set from highest to lowest score, calculating percentiles is easy enough.

 

Allowing the same score to fall into different percentiles:

 

data want;

set have nobs=_total_obs_;

by descending score;

percentile = ceil(100 * _n_  / _total_obs_);

run;

 

Forcing the same score to fall into the same percentile (possibly causing uneven grouping):

 

data want;

set have nobs=_total_obs_;

by descending score;

retain percentile;

if first.score then percentile = ceil(100 * _n_ / _total_obs_);

run;

 

I can't check the code right now, but I think the formulas are correct.  Also, make sure your incoming data set does not already contain a variable named PERCENTILE.  Otherwise the second program will come up with the wrong answer.

Ksharp
Super User

You can use proc rank to get quantile too.

proc rank data=sashelp.class out=want groups=10 descending;
var weight;
ranks rank;
run;
proc sort data=want;
 by rank;
run;
proc print noobs;run;

Or you can change it by hand after geting quantile by proc univariate.

for example:

 

data want;

 set have;

 rename P95=P05 P10=P90 ........;

run;

mkeintz
PROC Star

Yes, PROC RANK has a "descending" option, but if that options is used, please note that records that are assigned to a common group in normal proc rank will not be guaranteed to stay together using the DESCENDING option, even when there are no ties. 

 

Consider the following, using a modification of the height variable that guarantees no ties.  You'll see that the group with the lowest height values has only one record in dataset ASC, but 2 records in DESC.

 

data class;
  set sashelp.class (keep=name height);
  height=height+_n_/100;
  h2=height;
run;

proc rank data=class out=desc groups=10 descending;
  var height;
run;
proc sort data=desc; by h2 name;run;

proc rank data=class out=asc groups=10 ;
  var height;
run;
proc sort data=asc; by h2 name;run;

 

In other words, even though the group assignment in both cases range from 0 to 9 (or 9 to 0), assignment in one dataset is not always nine minus the assignment in the other.  That is due to the fact that when groups must have differing sizes, the "extra" observations are apparently assigned, one by one, to the highest group id, then the second highest, third highest, etc.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 4143 views
  • 0 likes
  • 5 in conversation