Re: Nth highest salary using Data Step function

animesh123 · Posted 08-23-2022 11:50 AM

How to find Nth highest salary using Data Step function?

Reeza · Posted 08-23-2022 12:09 PM

proc sort data=have;
by Salary;
run;

data highestNthSalary;
set have firstobs=N obs=N;
run;

data highestSalaries;
set have;
if _n_=1 then counter=1;
else counter+1;
if counter < N;
run;

ballardw · Posted 08-23-2022 12:56 PM

If all the values are on a single record use the Largest function:

The transpose is to get all the values of a variable on a single record. The default names will be Col1, Col2, ... ,Col19 (19 records in the set).

Use the "of Col:" to have all variables in a list to use in the function call in this case.

proc transpose data=sashelp.class out=trans;
var weight;
run;

data want;
   set trans;
   /* replace 5 with n of interest*/
   v = largest(5, of col:);
   keep v;
run;

More of an "exercise for the interested reader" but may be easier if the N is to be per some combination of other variables. Then sort and transpose by those may be simpler than trying to code through a bunch of counters.

PaigeMiller · Posted 08-23-2022 01:00 PM

@animesh123 wrote:

How to find Nth highest salary using Data Step function?

Why insist on doing it in a DATA step? Why not use the easiest method, which is PROC RANK, and this also has features for handling ties, which the data step method does not have (unless you do additional programming).

--
Paige Miller

Sajid01 · Posted 08-23-2022 02:15 PM

Typical SAS interview question one comes across.
While there could be better and more elegant ways of doing it, a simple solution for interview purpose is given below.
I am using second highest. That should serve the purpose.
Change as needed.

data pay;
input id name $ salary;
datalines;
101 Sonu 120000
102 Monu 150000
104 Tanu 100000
105 Manu 101000
run;
proc sort data=pay;
by descending Salary ;
run;
data _null_;
set pay;
if _n_=2 then do;
call symput("Second_highest_salary",Salary);
end;
run;
%put &=Second_highest_salary;

The log has the value of second highest salary

Kurt_Bremser · Posted 08-23-2022 04:43 PM

Define how you want to handle ties.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Sajid01 · Posted 08-23-2022 08:11 PM

Thanks for pointing out the case of ties. Updated the code to take care of the ties.
The question is about the salaries that can be taken care of by removing duplicate values.
The code and data are self explanatory.
I repeat what I have said earlier, there may be more elegant and better ways to solve this question, but to me this appears to be a simple approach..

%let nth=2;
data pay;
input id name $ salary;
datalines;
100 Teetu 150000
101 Sonu 120000
102 Monu 150000
104 Tanu 100000
105 Manu 101000
106 Bablu 150000
run;
proc sort data=pay nodupkey;
by descending Salary ;
run;

data _null_;
set pay;
if _n_= &nth.then do;
call symput("Nth_highest_salary",Salary);
end;
run;
%put &=Nth_highest_salary where &=nth.;

The log will have the

PeterClemmensen · Posted 08-24-2022 04:14 AM

data want;
   if _N_ = 1 then do;
      declare hash h (dataset : 'sashelp.baseball(obs=0)', ordered : 'd', multidata : 'Y'); 
      h.definekey('Salary');                                                      
      h.definedata(all : 'y');                                                         
      h.definedone();                                                                   
      declare hiter i('h');                                                           
   end;                                                                                 
																						
   do until (z);                                                             
      set sashelp.baseball end = z;                                                                                                                                       
      h.add();                                                                          
   end;                                                                                 
																						
   do _N_ = 1 by 1 while (i.next() = 0 & _N_ <= 5);                                    
      output;                                                                           
   end;                                                                                 
																						
   _N_ = i.first();                                                                    
   _N_ = i.prev();                                                                     
   h.clear();                                                                           
run;

The DATA to DATA Step Macro
Blog: SASnrd

Ksharp · Posted 08-24-2022 08:17 AM

data have;
 set sashelp.heart(keep=weight rename=(weight=salary));
run;


%let n=4;
data _null_;
 set have end=last;
 array x{999999} _temporary_;
 x{_n_}=salary;
 if last then do; 
  largest_&n=largest(&n,of x{*}) ; 
  put largest_&n=;
 end;
run;

FreelanceReinh · Posted 08-24-2022 09:37 AM

%let n=4;
data _null_;
array x[&n] _temporary_ (&n*0);
retain m 0;
set have end=last;
if salary>m then do;
  x[whichn(m, of x[*])]=salary;
  m=min(of x[*]);
end;
if last then put "&n.th largest = " m;
run;

SAS Training: Just a Click Away