BookmarkSubscribeRSS Feed
nathanleggett
Calcite | Level 5

I have this data:

Spoiler
data dat;
   input ID 10 domain rowpercent;
   datalines;
RZL34500 1 2.8
RZL34500 2 0.0
RZL34500 3 71.1
RZL34500 4 0.0
RZL34500 5 4.8
RZL34500 6 2.8
RZL34500 7 2.9
RZL34500 8 4.8
RZL34500 9 4.8
RZL34500 10 0.0
RZL34500 11 0.0
RZL34500 12 0.0
RZL34500 13 0.0
RZL34500 14 2.5
RZL34500 15 0.0
RZL34500 16 0.0
RZL34500 17 0.0
RZL34500 18 0.0
RZL34500 19 0.0
;

I am trying to rank it so that higher values are ranked higher:

 

proc rank data=dat out=temp_1 ties=mean descending;
	by id;
	var RowPercent;
	ranks rank;
run;

My results are:

 

ID	         domain rowpercent	rank
RZL34500	1	2.8	                 6.5
RZL34500	2	0	                 14
RZL34500	3	71.1	                 1
RZL34500	4	0	                 14
RZL34500	5	4.8	                 3
RZL34500	6	2.8	                 6.5
RZL34500	7	2.9	                 5
RZL34500	8	4.8	                 3
RZL34500	9	4.8	                 3
RZL34500	10	0	                 14
RZL34500	11	0	                 14
RZL34500	12	0	                 14
RZL34500	13	0	                 14
RZL34500	14	2.5	                 8
RZL34500	15	0	                 14
RZL34500	16	0	                 14
RZL34500	17	0	                 14
RZL34500	18	0	                 14
RZL34500	19	0	                 14

Why are there multiple '3's and '14's?

 

Secondary question: Is there a way that I can rank these values such that I get an accurate 'top 3'?

 

Thanks in advance

2 REPLIES 2
Reeza
Super User
Because you have ties....a value of 0 is a value of 0 and it will have the same rank as other 0s. Is there another way you'd like ties to be accounted for? You could just sort and create a counter but then you don't really have a true indication of the rank because does a rank of 15 mean anything if the rank of 14 and 16 have the same underlying value?
FreelanceReinh
Jade | Level 19

Hello @nathanleggett and welcome to the SAS Support Communities!

 

First off a correction to your INPUT statement in order to make the DATA step work:

input ID :$10. domain rowpercent;

@nathanleggett wrote:

Why are there multiple '3's and '14's?


[As Reeza has pointed out already:] This is because there are tied observations, i.e. groups of observations with the same ROWPERCENT value, and PROC RANK assigns the same rank to all observations within a group. For instance, the 2nd, 3rd and 4th largest value are 4.8 so that each of them is assigned rank 3, the average of 2, 3 and 4 because you specified ties=mean.


Secondary question: Is there a way that I can rank these values such that I get an accurate 'top 3'?


Sure, but you need to decide how to handle the ties. For example, you could select all observations with the top three values by using

proc rank data=dat out=temp_1(where=(rank<=3)) ties=dense descending;

and then restrict dataset TEMP_1 to three observations in a subsequent step if needed.

 

Or use a different procedure instead of PROC RANK:

 

E.g., PROC SUMMARY:

proc summary data=dat;
by id;
var rowpercent;
output out=top3(drop=_:) idgrp (max(rowpercent) out[3] (rowpercent domain)=) / autoname;
run;

(There are options to modify the handling of ties.)

 

Or PROC UNIVARIATE:

ods select none;
ods output extremeobs=top3obs(drop=low: varname);
proc univariate data=dat nextrobs=3;
by id;
var rowpercent;
run;
ods select all;

(A variant of this could select the top three values [disregarding multiplicities] rather than the top three observations.)

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1399 views
  • 0 likes
  • 3 in conversation