DATA Step, Macro, Functions and more

how to get 3rd highest salary in group wise using data step and proc sql

Reply
Frequent Contributor
Posts: 76

how to get 3rd highest salary in group wise using data step and proc sql

I have a dataset EMP

EMPNO   Salary   empcode

111          4000       MGR
112          6000       MGR

114          2000       MGR

115          8000       MGR

223          2000       clerk

226          1000       clerk

228           3000      clerk

300           500        peon

333           700        peon

345           300        peon

356           200        peon

320           700        peon

I want a dataset which contain 3rd highest salary in group wise empcode

EMPNO   Salary   empcode

111          4000       MGR

223          2000       clerk

345           300        peon

Please help me how to write a program in sas proc sql and data sep and using sas macro.

Thanks & Regards,

Ashwini

Super User
Posts: 9,681

Re: how to get 3rd highest salary in group wise using data step and proc sql

Here is a Data Step way.

PG may give you the SQL version.

data x;
input EMPNO   Salary   empcode $;
cards;
111          4000       MGR
112          6000       MGR
114          2000       MGR
115          8000       MGR
223          2000       clerk
226          1000       clerk
228           3000      clerk
300           500        peon
333           700        peon
345           300        peon
356           200        peon
320           700        peon
;
run;
proc sort data=x nodupkey; by empcode descending salary;run;
data want;
 set x;
 by empcode;
 if first.empcode then n=0;
 n+1;
 if n=3 then output;
run;

Ksharp

Super Contributor
Posts: 644

Re: how to get 3rd highest salary in group wise using data step and proc sql

What about if there are less than 3 representatives of a group?

Would you want the lowest salary out of 2 or the only salary represented in your dataset?

If so, no problem, modify Ksharp's code slightly

data want;

set x;

by empcode;

if first.empcode then n=0;

n+1;

if n=3

     or n<3 and last.empcode

        then output;

run;

Frequent Contributor
Posts: 76

Re: how to get 3rd highest salary in group wise using data step and proc sql

Thanks a lot for your kind information.

Could you pls explain me how it done in proc sql and sas macro

Super User
Posts: 5,083

Re: how to get 3rd highest salary in group wise using data step and proc sql

If a group has five people, and the top 3 all have the same salary, which one do you want to select?

I imagine the SQL gurus out there could select/remove the max, select/remove the max from the remainder, then select the max.

Why would you want to use macro language?

Good luck.

Regular Contributor
Posts: 151

Re: how to get 3rd highest salary in group wise using data step and proc sql

What happens if there is a tie for the 3rd highest?  If you want to return both records, then you need a different solution to that proposed.  One option is to run Proc Rank after the dedupe, keep the 3rd highest rank and then merge this back to the original table.  Example below with modified original data  (record 3 has had salary amended to 4,000).

With regards to sas macro, the only reason I can see a use for this is if you needed to change the ranking number (e.g. 4th highest instead of 3rd).

Note also that the solution from @KSharp deletes duplicate records from the source data as there is no 'out=' option!

data x;

input empno salary empcode $;

cards;

111          4000       MGR

112          6000       MGR

114          4000       MGR

115          8000       MGR

223          2000       clerk

226          1000       clerk

228           3000      clerk

300           700        peon

333           700        peon

345           300        peon

356           200        peon

320           700        peon

;

run;

proc sort data=x nodupkey out=dedupe_x;

by empcode descending salary;

run;

proc rank data=dedupe_x out=ranked_x (where=(salary_rank=3)) descending;

by empcode;

var salary;

ranks salary_rank;

run;

proc sql;

create table want as select

a.*

from x as a inner join

  ranked_x as b on a.empcode=b.empcode and a.salary=b.salary;

quit;

Super User
Posts: 5,083

Re: how to get 3rd highest salary in group wise using data step and proc sql

Keith,

As you noted, there is no valid solution possible unless there is a decision made about how to handle ties.  Even PROC RANK can't necessarily get around that fact, as there will be some ranking methods that produce 0 records having salary_rank=3.  A double DOW would be most flexible, since it allows different counting/ranking methods:

data want;

   do until (last.empcode);

        set have;

        by empcode descending salary;

        * Various flexible counting/ranking methods are possible here to select the target salary;

   end;

   do until (last.empcode);

       set have;

       by empcode descending salary;

       * Various outputting methods are available here, depending on whether tied records should be output;

   end;

run;


But the OP has some decisions to make before the programming can be done.

Respected Advisor
Posts: 3,124

Re: how to get 3rd highest salary in group wise using data step and proc sql

Here is a hard-coding low efficient SQL approach:

data have;

input EMPNO$ Salary empcode$;

cards;

111 4000 MGR

112 6000 MGR

114 2000 MGR

115 8000 MGR

223 2000 clerk

226 1000 clerk

228 3000 clerk

300 500 peon

333 700 peon

345 300 peon

356 200 peon

320 700 peon

;

proc sql;

select * from

(select * from (select * from have group by empcode having salary ne max(salary))

  group by empcode having salary ne max(salary))

  group by empcode having salary eq max(salary);

quit;

Haikuo

Respected Advisor
Posts: 3,124

Re: how to get 3rd highest salary in group wise using data step and proc sql

Here is a not so hard-coding SQL approach:

data have;

input EMPNO$ Salary empcode$;

cards;

111 4000 MGR

112 6000 MGR

114 2000 MGR

115 8000 MGR

223 2000 clerk

226 1000 clerk

228 3000 clerk

300 500 peon

333 700 peon

345 300 peon

356 200 peon

320 700 peon

;

proc sql;

select * from have a1 ,

(select empcode,salary from

  (select distinct b.empcode,b.salary, a.salary as _salary from have a

  left join have b

  on a.empcode=b.empcode)

group by empcode,salary

having sum(_salary>=salary)=3) b1

where a1.empcode=b1.empcode and a1.salary=b1.salary;

quit;

Haikuo

Ask a Question
Discussion stats
  • 8 replies
  • 9751 views
  • 0 likes
  • 6 in conversation