I have the following data
ID Name start_date Salary Bonus Status
1 Jon 06-21-2022 100 10 A
1 Jon 06-21-2022 100 10 E
2 Joe 06-21-2022 100 10 A
2 Joe 06-22-2022 100 10 A
2 Joe 06-22-2022 100 10 E
3 Ron 06-22-2022 100 10 A
3 Ron 06-22-2022 105 10 A
4 Tom 06-21-2022 100 10 E
4 Tom 06-22-2022 100 10 E
5 Sam 06-21-2022 100 10 E
5 Sam 06-22-2022 100 10 E
5 Sam 06-22-2022 100 10 E
I want to get the following result
ID Name start_date Salary Bonus Status
1 Jon 06-21-2022 100 10 A
2 Joe 06-21-2022 100 10 A
3 Ron 06-22-2022 105 10 A
4 Tom 06-21-2022 100 10 E
5 Sam 06-21-2022 105 10 E
here are the rules;
1. If no duplicate ID, return that record
2. If dupe ID, and one active status, return that record
3. if dupe id and multiple active statuses, return row with earliest start_date
4. if dupe id, multiple active statuses and multiple same start_dates, return the highest salary
5. if dupe id and expired statuses, return the earliest start_date
how can I achieve this with SAS code? any help/suggestion will be greatly appreciated.
Thanks
Lux
Try this:
proc sort data=have;
by id status start_date descending salary:
run;
data want;
set have;
by id;
if first.id;
run;
Untested, posted from my tablet.
Try this:
proc sort data=have;
by id status start_date descending salary:
run;
data want;
set have;
by id;
if first.id;
run;
Untested, posted from my tablet.
Great! thank you.
also how can i capture the rest of the data in a separate file?
TIA.
-Lux
That's easy in a data step. Create two datasets, and use conditional OUTPUTs:
data want1 want2;
set have;
by id;
if first.id
then output want1;
else output want2;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.