BookmarkSubscribeRSS Feed
Ashwini
Calcite | Level 5

I have a dataset EMP

EMPNO   Salary   empcode

111          4000       MGR
112          6000       MGR

114          2000       MGR

115          8000       MGR

223          2000       clerk

226          1000       clerk

228           3000      clerk

300           500        peon

333           700        peon

345           300        peon

356           200        peon

320           700        peon

I want a dataset which contain 3rd highest salary in group wise empcode

EMPNO   Salary   empcode

111          4000       MGR

223          2000       clerk

345           300        peon

Please help me how to write a program in sas proc sql and data sep and using sas macro.

Thanks & Regards,

Ashwini

9 REPLIES 9
Ksharp
Super User

Here is a Data Step way.

PG may give you the SQL version.

data x;
input EMPNO   Salary   empcode $;
cards;
111          4000       MGR
112          6000       MGR
114          2000       MGR
115          8000       MGR
223          2000       clerk
226          1000       clerk
228           3000      clerk
300           500        peon
333           700        peon
345           300        peon
356           200        peon
320           700        peon
;
run;
proc sort data=x nodupkey; by empcode descending salary;run;
data want;
 set x;
 by empcode;
 if first.empcode then n=0;
 n+1;
 if n=3 then output;
run;

Ksharp

RichardinOz
Quartz | Level 8

What about if there are less than 3 representatives of a group?

Would you want the lowest salary out of 2 or the only salary represented in your dataset?

If so, no problem, modify Ksharp's code slightly

data want;

set x;

by empcode;

if first.empcode then n=0;

n+1;

if n=3

     or n<3 and last.empcode

        then output;

run;

Ashwini
Calcite | Level 5

Thanks a lot for your kind information.

Could you pls explain me how it done in proc sql and sas macro

Astounding
PROC Star

If a group has five people, and the top 3 all have the same salary, which one do you want to select?

I imagine the SQL gurus out there could select/remove the max, select/remove the max from the remainder, then select the max.

Why would you want to use macro language?

Good luck.

Keith
Obsidian | Level 7

What happens if there is a tie for the 3rd highest?  If you want to return both records, then you need a different solution to that proposed.  One option is to run Proc Rank after the dedupe, keep the 3rd highest rank and then merge this back to the original table.  Example below with modified original data  (record 3 has had salary amended to 4,000).

With regards to sas macro, the only reason I can see a use for this is if you needed to change the ranking number (e.g. 4th highest instead of 3rd).

Note also that the solution from @KSharp deletes duplicate records from the source data as there is no 'out=' option!

data x;

input empno salary empcode $;

cards;

111          4000       MGR

112          6000       MGR

114          4000       MGR

115          8000       MGR

223          2000       clerk

226          1000       clerk

228           3000      clerk

300           700        peon

333           700        peon

345           300        peon

356           200        peon

320           700        peon

;

run;

proc sort data=x nodupkey out=dedupe_x;

by empcode descending salary;

run;

proc rank data=dedupe_x out=ranked_x (where=(salary_rank=3)) descending;

by empcode;

var salary;

ranks salary_rank;

run;

proc sql;

create table want as select

a.*

from x as a inner join

  ranked_x as b on a.empcode=b.empcode and a.salary=b.salary;

quit;

Astounding
PROC Star

Keith,

As you noted, there is no valid solution possible unless there is a decision made about how to handle ties.  Even PROC RANK can't necessarily get around that fact, as there will be some ranking methods that produce 0 records having salary_rank=3.  A double DOW would be most flexible, since it allows different counting/ranking methods:

data want;

   do until (last.empcode);

        set have;

        by empcode descending salary;

        * Various flexible counting/ranking methods are possible here to select the target salary;

   end;

   do until (last.empcode);

       set have;

       by empcode descending salary;

       * Various outputting methods are available here, depending on whether tied records should be output;

   end;

run;


But the OP has some decisions to make before the programming can be done.

Haikuo
Onyx | Level 15

Here is a hard-coding low efficient SQL approach:

data have;

input EMPNO$ Salary empcode$;

cards;

111 4000 MGR

112 6000 MGR

114 2000 MGR

115 8000 MGR

223 2000 clerk

226 1000 clerk

228 3000 clerk

300 500 peon

333 700 peon

345 300 peon

356 200 peon

320 700 peon

;

proc sql;

select * from

(select * from (select * from have group by empcode having salary ne max(salary))

  group by empcode having salary ne max(salary))

  group by empcode having salary eq max(salary);

quit;

Haikuo

Haikuo
Onyx | Level 15

Here is a not so hard-coding SQL approach:

data have;

input EMPNO$ Salary empcode$;

cards;

111 4000 MGR

112 6000 MGR

114 2000 MGR

115 8000 MGR

223 2000 clerk

226 1000 clerk

228 3000 clerk

300 500 peon

333 700 peon

345 300 peon

356 200 peon

320 700 peon

;

proc sql;

select * from have a1 ,

(select empcode,salary from

  (select distinct b.empcode,b.salary, a.salary as _salary from have a

  left join have b

  on a.empcode=b.empcode)

group by empcode,salary

having sum(_salary>=salary)=3) b1

where a1.empcode=b1.empcode and a1.salary=b1.salary;

quit;

Haikuo

Zhanxiong
Obsidian | Level 7

The SQL solution, though probably not the most efficient one, is quite elegant in my opinion:

 

proc sql;
    select 
        *
    from 
        EMP as e1
    where
        2 = (select 
                 count(distinct e2.Salary)
            from
                 EMP as e2
            where
                 e1.empcode = e2.empcode and e2.Salary > e1.Salary);
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 25300 views
  • 1 like
  • 7 in conversation