Hello!
I am working with a dataset that has employee start and end dates. I want to collapse this employ table into one current row and overwrite the beginning date with the first date provided of the year in the begin_date column
Here's a picture of what I'd like to do:
I am working with over a million rows and would like to narrow it down to 2023, but hoping to be able to use this same code going into future years.
Can anyone suggest how to do this?
thank you!
%let year = 2023;
data want;
set have;
by employee_id;
retain _start;
if first.employee_id then _start = .;
if _start =. and year(begin_date) = &year. then _start = begin_date;
if last.employee_id;
if _start ne . then begin_date = _start;
drop _start;
run;
%let year = 2023;
data want;
set have;
by employee_id;
retain _start;
if first.employee_id then _start = .;
if _start =. and year(begin_date) = &year. then _start = begin_date;
if last.employee_id;
if _start ne . then begin_date = _start;
drop _start;
run;
thank you! this worked!!!
Can you explain what "narrow it down to 2023" means?
Do you want to only keep the observations where the end_date is greater than '01JAN2023'd ?
Or did you have some other interpretation in mind?
sure! I meant to only work with employee begin dates that started in 2023. Sorry for the confusion! I wanted to keep the most recent observation, it has an end date of 31-dec-99, but insert the earliest date that started in 2023 for the employee as the begin date for that row.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.