I have the following data
ID Name start_date Salary Bonus Status
1 Jon 06-21-2022 100 10 A
1 Jon 06-21-2022 100 10 E
2 Joe 06-21-2022 100 10 A
2 Joe 06-22-2022 100 10 A
2 Joe 06-22-2022 100 10 E
3 Ron 06-22-2022 100 10 A
3 Ron 06-22-2022 105 10 A
4 Tom 06-21-2022 100 10 E
4 Tom 06-22-2022 100 10 E
5 Sam 06-21-2022 100 10 E
5 Sam 06-22-2022 100 10 E
5 Sam 06-22-2022 100 10 E
I want to get the following result
ID Name start_date Salary Bonus Status
1 Jon 06-21-2022 100 10 A
2 Joe 06-21-2022 100 10 A
3 Ron 06-22-2022 105 10 A
4 Tom 06-21-2022 100 10 E
5 Sam 06-21-2022 105 10 E
here are the rules;
1. If no duplicate ID, return that record
2. If dupe ID, and one active status, return that record
3. if dupe id and multiple active statuses, return row with earliest start_date
4. if dupe id, multiple active statuses and multiple same start_dates, return the highest salary
5. if dupe id and expired statuses, return the earliest start_date
how can I achieve this with SAS code? any help/suggestion will be greatly appreciated.
Thanks
Lux
Try this:
proc sort data=have;
by id status start_date descending salary:
run;
data want;
set have;
by id;
if first.id;
run;
Untested, posted from my tablet.
Try this:
proc sort data=have;
by id status start_date descending salary:
run;
data want;
set have;
by id;
if first.id;
run;
Untested, posted from my tablet.
Great! thank you.
also how can i capture the rest of the data in a separate file?
TIA.
-Lux
That's easy in a data step. Create two datasets, and use conditional OUTPUTs:
data want1 want2;
set have;
by id;
if first.id
then output want1;
else output want2;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.