BookmarkSubscribeRSS Feed
anweinbe
Quartz | Level 8

I have data that I sorted by Company Year and Salary Descending.

I want to extract only the top 2 salary records for each company and year to a new table.

 

My data looks like:

Company     Year         Salary

1                 2020          10,000

1                 2020           5,000

1                 2020           3,000

1                 2020           2,000

1                 2019           9,000

1                 2019           5,000

1                 2019           3,000

2                 2020           8,000

2                 2020           5,000

2                 2020           3,000

 

My desired outcome is my new table will only have the records for 10,000 and 5,000 for company 1 2020, 9,000 and 5,000 for company 1 for 2019, and 8,000 and 5,000 for company 2 in 2020

 

I tried the following:

proc sort data=temp.own2_for_sample3; by company Year descending salary;
run;

data temp.own2_for_sample3; set temp.own2_for_sample3;
by company Year salary;
if first.year then rnk=0;
rnk +1;
run;

 

It appears to work.... however I am afraid that since I'm only doing first.year and not including the first.company it could mess up. Any easy modifications?

 

Thank you!

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data want;
 do _n_=1 by 1 until(last.year);
  set have;
  by company  year;
  if _n_<=2 then output;
 end;
run;
novinosrin
Tourmaline | Level 20

Actually your solution is neat and simple

anweinbe
Quartz | Level 8

To solve for my issue... do you think if I do If last.year OR if last.company then rnk=0?

novinosrin
Tourmaline | Level 20

 I modified yours-

proc sort data=temp.own2_for_sample3; 
by company Year descending salary;
run;

data temp.own2_for_sample3; 
set temp.own2_for_sample3;
by company Year ;
if first.year then rnk=0;
rnk +1;
if rnk<=2;
run;
anweinbe
Quartz | Level 8

My worry is that I could have a situation where the company changes but not the year.

 

Would this fix that worry? I think it does....

if first._year or first.company then rnk=0;

 

 

novinosrin
Tourmaline | Level 20

Basically you could have-

 

1. N companies

2. You need to understand the SORT mechanics

3. when you sort by Company year , you end up having unique combinations of company and year forming its own group

4. Your descending Salary would order the salaries for each unique group of company-year.

5. Now, You want to write the 1st 2 records for each of these unique Company-year combination.

6. Your rnk counter safely increments for each record of the company-year combination

7. All you need is to apply a filter to restrict the resuts to a count of 2 or less i.e. <=2 and then you should have your results?

Does the above make sense?

mkeintz
PROC Star

You know about first.year, so consider also using lag(first.year):

 

proc sort data=temp.own2_for_sample3;
  by company Year descending salary;
run;

data temp.own2_for_sample3; 
   set temp.own2_for_sample3;
   by company Year ;
   if first.year or lag(first.year);
   rnk=first.year+2*lag(first.year);  /* Editted in the "2*" factor */
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 936 views
  • 0 likes
  • 3 in conversation