Hi everyone,
I have a problem assigning proper dates for different ids.
Please help me out. I really appreciate it.
For different ids, if the enter_date is the same as the previous leave_date or within the previous leave_date + 30 days, consider it as re-enter, then use the first enter_date as the enter_date, and the last leave_date as leave_date.
The following is the original data and the data I would like it to look like. The data has been sorted.
have | |||
id | enter_date | leave_date | service_date |
1 | 1/21/19 | 1/27/19 | 1/21/19 |
1 | 2/25/19 | 3/9/19 | 2/25/19 |
1 | 4/1/19 | 4/2/19 | 4/1/19 |
1 | 5/1/19 | 5/5/19 | 5/1/19 |
2 | 3/1/18 | 3/4/18 | 3/1/18 |
2 | 3/10/18 | 6/10/18 | 3/7/18 |
3 | 6/7/18 | 7/5/18 | 6/2/18 |
3 | 7/3/19 | 7/9/19 | 7/3/19 |
3 | 8/11/19 | 8/30/19 | 8/11/19 |
3 | 9/1/19 | 9/5/19 | 9/1/19 |
4 | 1/31/18 | 2/14/18 | 1/31/18 |
4 | 2/14/18 | 5/14/18 | 2/14/18 |
4 | 6/1/18 | 6/7/18 | 6/1/18 |
5 | 11/12/18 | 12/12/18 | 11/11/18 |
5 | 12/28/18 | 2/1/19 | 12/28/18 |
6 | 1/1/19 | 2/1/19 | 1/1/19 |
7 | 2/21/18 | 2/24/18 | 2/21/18 |
7 | 2/27/18 | 3/3/18 | 2/26/18 |
7 | 10/10/18 | 10/23/18 | 10/10/18 |
7 | 11/1/18 | 12/1/18 | 11/1/18 |
want | |||
id | enter_date | leave_date | service_date |
1 | 1/21/19 | 5/5/19 | 1/21/19 |
1 | 1/21/19 | 5/5/19 | 2/25/19 |
1 | 1/21/19 | 5/5/19 | 4/1/19 |
1 | 1/21/19 | 5/5/19 | 5/1/19 |
2 | 3/1/18 | 6/10/18 | 3/1/18 |
2 | 3/1/18 | 6/10/18 | 3/7/18 |
3 | 6/7/18 | 7/5/18 | 6/2/18 |
3 | 7/3/19 | 9/5/19 | 7/3/19 |
3 | 7/3/19 | 9/5/19 | 8/11/19 |
3 | 7/3/19 | 9/5/19 | 9/1/19 |
4 | 1/31/18 | 6/7/18 | 1/31/18 |
4 | 1/31/18 | 6/7/18 | 2/14/18 |
4 | 1/31/18 | 6/7/18 | 6/1/18 |
5 | 11/12/18 | 2/1/19 | 11/11/18 |
5 | 11/12/18 | 2/1/19 | 12/28/18 |
6 | 1/1/19 | 2/1/19 | 1/1/19 |
7 | 2/21/18 | 3/3/18 | 2/21/18 |
7 | 2/21/18 | 3/3/18 | 2/26/18 |
7 | 10/10/18 | 12/1/18 | 10/10/18 |
7 | 10/10/18 | 12/1/18 | 11/1/18 |
Thank you very much for taking the time to help me out!
Amber
data have;
infile datalines ;
input id (enter_date leave_date service_date) ( :mmddyy9.);
format enter_date -- service_date date9.;
datalines;
1 1/21/19 1/27/19 1/21/19
1 2/25/19 3/9/19 2/25/19
1 4/1/19 4/2/19 4/1/19
1 5/1/19 5/5/19 5/1/19
2 3/1/18 3/4/18 3/1/18
2 3/10/18 6/10/18 3/7/18
3 6/7/18 7/5/18 6/2/18
3 7/3/19 7/9/19 7/3/19
3 8/11/19 8/30/19 8/11/19
3 9/1/19 9/5/19 9/1/19
4 1/31/18 2/14/18 1/31/18
4 2/14/18 5/14/18 2/14/18
4 6/1/18 6/7/18 6/1/18
5 11/12/18 12/12/18 11/11/18
5 12/28/18 2/1/19 12/28/18
6 1/1/19 2/1/19 1/1/19
7 2/21/18 2/24/18 2/21/18
7 2/27/18 3/3/18 2/26/18
7 10/10/18 10/23/18 10/10/18
7 11/1/18 12/1/18 11/1/18
;
run;
data temp;
set have;
by id;
if first.id or
lag(leave_date)>enter_date or
enter_date>lag(leave_date)+30 then group+1;
run;
proc sql;
create table want as
select *,min(enter_date) as new_enter_date format=date9.,
max(leave_date) as new_leave_date format=date9.
from temp
group by group;
quit;
The first row of your sample data has leave_date=1/21/2019, and the second row has enter_date=2/27/2019. That's more than a 30-day difference - it's 31 days. But you kept the record with the remaining ID=1 records. Was that intentional?
Thanks for pointing it out. It’s a typo I have corrected it.
As you read a set of records, you need to compare the current-record LEAVE_DATE to the next record's ENTER_DATE. You can preserve the first ENTER_DATE in an extra variable (_ENTER1). If the gap is greater than 30 days, or you are at the end of an ID group, then you can re-read the same records (without re-reading LEAVE_DATE or ENTER_DATE), and output those records.
You don't need to re-read ENTER_DATE because you can replace it with _ENTER1. And you don't need to re-read LEAVE_DATE because the LEAVE_DATE at the end of the first pass is the one to keep:
data have;
infile datalines ;
input id (enter_date leave_date service_date) ( :mmddyy9.);
format enter_date -- service_date date9.;
datalines;
1 1/21/19 1/27/19 1/21/19
1 2/25/19 3/9/19 2/25/19
1 4/1/19 4/2/19 4/1/19
1 5/1/19 5/5/19 5/1/19
2 3/1/18 3/4/18 3/1/18
2 3/10/18 6/10/18 3/7/18
3 6/7/18 7/5/18 6/2/18
3 7/3/19 7/9/19 7/3/19
3 8/11/19 8/30/19 8/11/19
3 9/1/19 9/5/19 9/1/19
4 1/31/18 2/14/18 1/31/18
4 2/14/18 5/14/18 2/14/18
4 6/1/18 6/7/18 6/1/18
5 11/12/18 12/12/18 11/11/18
5 12/28/18 2/1/19 12/28/18
6 1/1/19 2/1/19 1/1/19
7 2/21/18 2/24/18 2/21/18
7 2/27/18 3/3/18 2/26/18
7 10/10/18 10/23/18 10/10/18
7 11/1/18 12/1/18 11/1/18
run;
data want (drop=_:);
do _n=1 by 1 until (last.id=1 or _nxt_enter > leave_date + 30);
set have (keep=id);
by id;
merge have
have (firstobs=2 keep=enter_date rename=(enter_date=_nxt_enter));
if _n=1 then _enter1=enter_date;
end;
enter_date=_enter1;
do _i=1 to _n;
set have (drop=enter_date leave_date);
output;
end;
run;
Thank you so much for such a detailed answer! The logic is absolutely correct, but I don't know why it also change the dates of the records which are not qualified as re-enter. Basically it changes the records to the same enter_date and leave_date for different ids .
data have;
infile datalines ;
input id (enter_date leave_date service_date) ( :mmddyy9.);
format enter_date -- service_date date9.;
datalines;
1 1/21/19 1/27/19 1/21/19
1 2/25/19 3/9/19 2/25/19
1 4/1/19 4/2/19 4/1/19
1 5/1/19 5/5/19 5/1/19
2 3/1/18 3/4/18 3/1/18
2 3/10/18 6/10/18 3/7/18
3 6/7/18 7/5/18 6/2/18
3 7/3/19 7/9/19 7/3/19
3 8/11/19 8/30/19 8/11/19
3 9/1/19 9/5/19 9/1/19
4 1/31/18 2/14/18 1/31/18
4 2/14/18 5/14/18 2/14/18
4 6/1/18 6/7/18 6/1/18
5 11/12/18 12/12/18 11/11/18
5 12/28/18 2/1/19 12/28/18
6 1/1/19 2/1/19 1/1/19
7 2/21/18 2/24/18 2/21/18
7 2/27/18 3/3/18 2/26/18
7 10/10/18 10/23/18 10/10/18
7 11/1/18 12/1/18 11/1/18
;
run;
data temp;
set have;
by id;
if first.id or
lag(leave_date)>enter_date or
enter_date>lag(leave_date)+30 then group+1;
run;
proc sql;
create table want as
select *,min(enter_date) as new_enter_date format=date9.,
max(leave_date) as new_leave_date format=date9.
from temp
group by group;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.