Select only the highest-ranked in each ranking group using the Rank column (1 is the highest possible rank).
.
contract Rankgroup CC Rank
1001 Neoplasm 1 CC-10 4
1001 Neoplasm 1 CC-9 2
1001 Neoplasm 1 CC-7 3
1001 Neoplasm 2 CC-10 1
1001 Neoplasm 2 CC-10 1
1001 Neoplasm 2 CC-7 3
data test;
input contract Rankgroup $5-15 CC $ Rank ;
datalines;
1001 Neoplasm 1 CC-10 4
1001 Neoplasm 1 CC-9 2
1001 Neoplasm 1 CC-7 3
1001 Neoplasm 2 CC-10 1
1001 Neoplasm 2 CC-10 1
1001 Neoplasm 2 CC-7 3
;
RUN;
PROC PRINT; RUN;
PROC SQL;
SELECT * FROM test WHERE RANK IN (SELECT MIN(RANK) FROM tesT);
QUIT;
Or, with just one pass through the data:
proc sql;
create table want as
select *
from have
group by rankgroup
having rank=min(rank)
;
quit;
Not arguing, Art, just taking a learning opportunity. I get the "query requires remerging" message from your query, so Proc Sql technically is reading the data twice, no?
Also, SasPhile, Art's elegant query reproduces the duplicate record you have in your original dataset. If you don't want that, you can add "distinct" as in "select distinct *. . .".
Karl
Karl, I agree! Conversely, Hima's code has an additional problem, in that a group by clause would have to be added in order for the OP to obtain the desired result.
SASPhile,
Other posters have correctly pointed out ... you have to decide what should happen when the same Rank appears multiple times within the same grouping.
I'm just asking about something else here ... Did you really mean to select the highest rank for each Rankgroup? Or should it be the combination of contract and Rankgroup together that determines the grouping?
Good luck.
Astounding,
select the highest rank for each Rankgroup.
SasPhile,
In that case, you have solutions posted by others that should work.
If you want to select just one record per Rankgroup (even if many are tied for the highest Rank), try Keith's below. If you want to select all records from a Rankgroup that share the highest Rank, try one of the SQL solutions above.
If you posted a solution that I misinterpreted, please say so. But I don't think DISTINCT(*) would limit the selection to one observation per Rankgroup if the variable CC might be different.
Good luck.
If you only want one of the highest rank records returned, then proc summary does this very easily.
proc summary data=test nway;
output out=top_rank (drop=_:) minid(rank(_all_))=;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.