Hello @nathanleggett and welcome to the SAS Support Communities!
First off, a correction to your INPUT statement, which is needed to make the DATA step work:
input ID :$10. domain rowpercent;
@nathanleggett wrote:
Why are there multiple '3's and '14's?
[As Reeza has pointed out already:] This is because there are tied observations, i.e., groups of observations with the same ROWPERCENT value, and PROC RANK assigns the same rank to all observations within such a group. For instance, the 2nd, 3rd and 4th largest values are all 4.8, so each of them is assigned rank 3, the average of 2, 3 and 4, because you specified ties=mean.
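To see the effect of ties=mean in isolation, here is a minimal sketch on made-up data. The dataset DEMO, the variable X and its values are purely illustrative and not taken from your data; only the pattern of three tied values mirrors the 4.8 case.

```sas
/* Hypothetical toy data: three observations share the value 4.8 */
data demo;
  input x;
  datalines;
5.0
4.8
4.8
4.8
3.1
;
run;

/* DESCENDING: the largest value gets rank 1.            */
/* TIES=MEAN: tied values get the average of the ranks   */
/* they would occupy, here (2+3+4)/3 = 3.                */
proc rank data=demo out=demo_ranked ties=mean descending;
  var x;
  ranks rank_x;
run;

proc print data=demo_ranked noobs;
run;
```

The three tied observations all receive rank 3, and ranks 2 and 4 do not occur at all, which is the same pattern as the repeated '3's in your output.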
Secondary question: Is there a way that I can rank these values such that I get an accurate 'top 3'?
Sure, but you need to decide how to handle the ties. For example, you could select all observations with the top three values by using
proc rank data=dat out=temp_1(where=(rank<=3)) ties=dense descending;
and then restrict dataset TEMP_1 to three observations in a subsequent step if needed.
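A minimal sketch of such a subsequent step, assuming that ID is the BY variable (as in the other examples in this post) and that the rank variable is named RANK (as in the WHERE= condition above). The output dataset name TOP3_RANK and the counter N are made up for illustration.

```sas
/* Order each ID group from best rank to worst */
proc sort data=temp_1;
  by id rank;
run;

/* Keep at most three observations per ID */
data top3_rank;
  set temp_1;
  by id;
  if first.id then n = 0;  /* restart the counter for each ID    */
  n + 1;                   /* sum statement: N is retained       */
  if n <= 3;               /* subsetting IF: keep the first three */
  drop n;
run;
```

Note that with this approach, which of several tied observations make it into the final three depends only on their order after sorting, so add further sort keys to the BY statement of PROC SORT if you need a specific tie-breaking rule.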
Or use a different procedure instead of PROC RANK:
E.g., PROC SUMMARY:
proc summary data=dat;
by id;
var rowpercent;
output out=top3(drop=_:) idgrp (max(rowpercent) out[3] (rowpercent domain)=) / autoname;
run;
(There are options to modify the handling of ties.)
Or PROC UNIVARIATE:
ods select none;
ods output extremeobs=top3obs(drop=low: varname);
proc univariate data=dat nextrobs=3;
by id;
var rowpercent;
run;
ods select all;
(A variant of this could select the top three values [disregarding multiplicities] rather than the top three observations.)
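One way such a variant might look is sketched below. It assumes the NEXTRVAL= option of PROC UNIVARIATE and the ODS table ExtremeValues that it produces; the output dataset name TOP3VAL is made up, and the DROP= list assumes that the columns for the lowest values start with "Low", as they do in the ExtremeObs table.

```sas
ods select none;
/* ExtremeValues lists distinct values, so ties count only once */
ods output extremevalues=top3val(drop=low: varname);
proc univariate data=dat nextrval=3;
  by id;
  var rowpercent;
run;
ods select all;
```

The resulting dataset should contain the three highest distinct ROWPERCENT values per ID together with their frequencies, but not the DOMAIN values, so a merge back to DAT would be needed to retrieve those.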