Hello
The following query select one observation that its weight value is second highest.
Data class;
input Cust_ID weight age;
1 59 20
2 61 25
3 95 40
4 92 25
5 51 18
6 92 21
7 50 43
8 60 23
9 70 71
10 92 62
;
Run;
Data class;
set sashelp.class;
Run;
proc sql outobs = 1;
select name, weight
from class
where weight not in (select max(weight) from class)
order by weight desc
;quit;
My question-
Let's say that there are multiple observations with weight=second highest value and I want to see these observations. What is the way to do it in one step? (So in this example I will see 3 observations since there are 3 observations with weight=92)
Use PROC RANK.
Please look at the examples at the link I gave. One of the examples is analogous to your problem, only trivial changes are needed to use that example in your situation.
/*
It is not right tool by using SQL.
Why not try data step ?
*/
Data class;
input Cust_ID weight age;
cards;
1 59 20
2 61 25
3 95 40
4 92 25
5 51 18
6 92 21
7 50 43
8 60 23
9 70 71
10 92 62
;
Run;
proc sql;
select *
from (select * from class except select * from class having weight= max(weight))
having weight= max(weight)
;quit;
A data step can pass through the data once to establish the MAX and 2ND highest values. It can then re-read the data and keep those obs that match the second.
data want (drop=_:);
set have (in=firstpass) have (in=secondpass);
retain _2ND _MAX;
array w{2} _2ND _MAX;
if firstpass then do; /*Update array W, if needed*/
if weight^=_MAX and weight>_2ND then do;
_2ND=weight;
call sortn(of w{*});
end;
end;
if secondpass and weight=_2ND;
run;
As proposed by @PaigeMiller
data have;
input Cust_ID weight age;
datalines;
1 59 20
2 61 25
3 95 40
4 92 25
5 51 18
6 92 21
7 50 43
8 60 23
9 70 71
10 92 62
;
proc rank data=have ties=dense descending out=want(where=(rank=2));
ranks rank;
var weight;
run;
proc print data=want;
run;
In the long run, I think SAS users are better off using built-in SAS capabilities (like PROC RANK) rather than writing their own DATA step code or SQL code to achieve the same thing. SAS has done the hard job of writing this code, making it robust to different possible problems including handling of missing values, debugging their code and testing it, and then SAS's code is proved to work in many real-world applications. In addition, if the problem changes to keeping the 5th largest ... trivial changes are needed to the PROC RANK code; I'm not sure the changes are trivial to DATA step or SQL solutions.
@PaigeMiller wrote:
In the long run, I think SAS users are better off using built-in SAS capabilities (like PROC RANK) rather than writing their own DATA step code or SQL code
...
In addition, if the problem changes to keeping the 5th largest ... trivial changes are needed to the PROC RANK code; I'm not sure the changes are trivial to DATA step or SQL solutions.
In general (and in this case), I agree with this sentiment. But with my clients/customers, I often find it's a good idea to anticipate a sort of mission creep, where the initial simple request (as in this topic) is expanded to include other objectives that won't be satisfied by the purpose-built SAS PROC.
In this particular case, changing the problem to the N'th largest would need only a trivial modification of the DATA step solution, but (I believe) a real headache for the PROC SQL approach.
Yes, "mission creep" can affect the programming that you do, its often a good thing to plan for it. On the other hand, for programmers who just want to learn how to do some task in SAS, that's not an issue.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.