BookmarkSubscribeRSS Feed
pavank
Obsidian | Level 7
data s;
input id price;
datalines;
100 25
100 60
100 65
100 80
100 91
101 55
101 25
101 85
101 100
;
run;

/* proc sql method : Second max price for each id */
proc sql;
    select *
    from s as a
    where  2=(select distinct price
        from s as b
        where a.price <= b.price
        ) ;
quit;

/* Datastep method :Second max price for each id */

data rank;
set s;
run;

proc sort data=s;
by descending id;
run;

data ranking;
set s;
by  descending id ;
if first.id then ranks=1;
else ranks+1;
proc print noobs;
run;


data maxvalue ;
set ranking  ;
if _n_=2;
proc print ;
run;

Q1.)Here I didn't get correct output for nth max value as of now i want to find second max price for each id how to fix this problem

14 REPLIES 14
Kurt_Bremser
Super User

You want the second highest price per id, so you need to sort by id and descending price.

Then do

data want;
set have;
by id;
if first.id
then count = 1;
else count + 1;
if count = 2;
drop count;
run;
Patrick
Opal | Level 21

You could use Proc Rank

data have;
  input id price;
datalines;
100 25
100 25
100 60
100 60
100 65
100 80
100 91
101 55
101 25
101 85
101 100
;

proc sort data=have out=have_sorted;
  by id;
run;

proc rank data=have_sorted ties=dense out=want(where=(rank=2));
  by id;
  var price;
  ranks rank;
run;

proc print data=want;
run;

Patrick_0-1701242475657.png

 

pavank
Obsidian | Level 7
But interviewer ask proc sql and datastep only not procedures now a days interviews ask most of secnario questions
PaigeMiller
Diamond | Level 26

@pavank wrote:
But interviewer ask proc sql and datastep only not procedures now a days interviews ask most of secnario questions

The correct answer to the interviewer is that you should not try to do this in PROC SQL when much simpler methods are available in SAS.

--
Paige Miller
s_lassen
Meteorite | Level 14

I do not quite understand what you are trying to do with the SQL code.

 

A data step solution can be done like this:

proc sort data=s;
  by id descending price;
run;

data ranks;
  do rank=1 by 1 until(last.id);
    set s;
    by id;
    output;
    end;
run;

An then it just to use "where rank=2" if that is the rank you want.

s_lassen
Meteorite | Level 14

SAS SQL is not really suited for this kind of question (other dialects have RANK summary functions, which can be used), so to get the second highest PRICE for each ID you will have to do something like

proc sql;
  select id,max(price) from(
    select * from s except
    select id,max(price) from s
    group by id)
  group by id;

in other words, subtract the rows with the highest values from the table in order to find the second highest. If you want to use SQL to find the third or fourth highest value, things begin to get complicated, with a larger number of nested queries.

 

Not that this solution will handle ties differently than the data step I wrote, as that will give different ranks to tied values - a data step that gives the same rank to ties can be written like this:

proc sort data=s;
  by id descending price;
run;

data ranks;
rank=0; do _N_=1 by 1 until(last.id); set s; by id; if _N_=1 or price ne lag(price) then rank+1; output; end; run;

But probably the most natural solution in SAS is to use PROC RANK, as suggested by @Patrick 

Ksharp
Super User

Here is for PROC SQL solution.But I would not advocate it.

data s;
input id price;
datalines;
100 25
100 60
100 65
100 80
100 91
101 55
101 25
101 85
101 100
;
run;

%let n=2;
proc sql;
create table want as
select a.*
 from s as a,s as b
  where a.id=b.id and a.price<=b.price
   group by a.id,a.price
     having count(*)<=&n.;
quit;

 

 

pavank
Obsidian | Level 7

Hi Sharp

Thanks for your solution

But interviewers ask reason why we can not achieved in proc sql can you explain how to explain proper reason 

Kurt_Bremser
Super User

SQL does not have tools for sequences (with the exception of the undocumented MONOTONIC function, which does not work as expected under certain circumstances).

Ksharp
Super User

As @Kurt_Bremser said PROC SQL is standard SQL (unlike ORACLE , DB2 ) can not handle the sequence records , the order of record does not honored by PROC SQL. That is why I do not recommend SQL for top nth MAX or MIN question . Date Step is right tool .

As you can see ,although I got solution by PROC SQL, that is still NOT a efficient way , if you have a big table , my PROC SQL would run forever .

PaigeMiller
Diamond | Level 26

I would say that PROC RANK ought to be preferred over a DATA step here. DATA steps require the user to create and then program the logic, not really that easy for most of us (even me). Whereas PROC RANK simply requires the normal information that most PROCs require: what data set, what variable(s), what output; the user does not have to create and then program the logic. Also, PROC RANK can handle ties; whereas handling ties in a DATA step requires the user to program additional logic.

--
Paige Miller
ballardw
Super User

Assuming the data set is amenable to transposing then Proc Transpose and Data step with the LARGEST function or  CALL SORTN on an array of the transposed values would work depending on exactly how the output is desired.

/* assuming Have is output of transposed data by Id */
data ranks;
   set have;
   array c(*) col: ;
   do rank=1 to n(of c(*));
      value = largest(rank,of c(*));
      output;
   end;
   keep id rank value;
run;

Skeleton of program for relatively "nice" input data and ignores any tie breaking.

 

But first ask the interviewer about which rules for tie-breaking are to be used for ranking tied values. Can't start the process without all the constraints.

 

 

 

mkeintz
PROC Star

If the data are already sorted by ID, but not necessarily by price within ID, then a single DATA step as below:

 

data s;
input id price;
datalines;
100 25
100 60
100 65
100 80
100 91
101 55
101 25
101 85
101 100
run;

%let rank=2;

data nth_largest (label="for N=&rank");
  set s;
  by id;
  array larg {&rank} _temporary_;
  if price>larg{1} then do;
    larg{1}=price;
    call sort(of larg{*});
  end;
  if last.id;
  price=larg{1};
  call missing(of larg{*});
run;

The array LARG, after the call sort, will be sorted from smallest (larg{1}) to largest (larg{&rank}).  Because the array is _temporary_, its values are retained across observations.

 

Additional note:

 

There has been some discussion of tied values.  In the code above, if there are duplicate instances of high values, each instance is treated as a separate value in the sort call  (i.e. for ID=100, if there were two instances of price=91, then 91 would be the 2nd largest, not 85).

 

But if you want to control for ties (i.e. you want the nth largest distinct value), just change

 

  if price>larg{1} then do;

to

  if price>larg{1} and whichn(price,of larg{*})=0 then do;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Sajid01
Meteorite | Level 14

Hello
This is a common question in SAS interviews.
This is my solution. If there are duplicate values for the nth position, reports them.
May not be the most elegant or efficient approach but it does works for the interview.

 

data s;
input Id Price;
datalines;
100 25
100 60
100 65
100 80
100 91
101 55
101 25
101 85
101 100
103 100
104 91
;
run;
%Let N=2;
PROC SQL;
Select Id, Price from s s1
where (&N.-1)=(select count(Distinct Price) from s s2 where s2.Price> s1.Price);
QUIT;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1963 views
  • 10 likes
  • 9 in conversation