Hello,
I have data with the duration of the contract. However, some data is unstructured and date contracts overlap. In these cases, I would like the term of the contract to be in line with other contracts. Below is an example of what the data looks like:
ID | START_DATE | STOP_DATE |
1 | 2019-01-01 | 2019-12-31 |
1 | 2020-01-01 | 2020-12-31 |
2 | 2020-01-01 | 2020-10-31 |
2 | 2020-01-07 | 2021-12-31 |
2 | 2020-09-01 | 2022-12-31 |
ID 1 is shown as valid. I would like to transform ID 2 into the following form:
ID | START_DATE | STOP_DATE |
1 | 2019-01-01 | 2019-12-31 |
1 | 2020-01-01 | 2020-12-31 |
2 | 2020-01-01 | 2020-01-06 |
2 | 2020-01-07 | 2020-08-31 |
2 | 2020-09-01 | 2022-12-31 |
I know the basics of SAS and I don't know how to solve this task. Can You help me?
Here is a method for looking ahead to see the value of START_DATE on the next observation.
data want ;
set have ;
by id start_date stop_date ;
set have(keep=start_date rename=(start_date=next_date) firstobs=2)
have(obs=1 drop=_all_)
;
if last.id then next_date=.;
new_stop = min(stop_date,next_date-1);
format new_stop yymmdd10.;
run;
Obs ID START_DATE STOP_DATE next_date new_stop 1 1 2019-01-01 2019-12-31 2020-01-01 2019-12-31 2 1 2020-01-01 2020-12-31 . 2020-12-31 3 2 2020-01-01 2020-10-31 2020-01-07 2020-01-06 4 2 2020-01-07 2021-12-31 2020-09-01 2020-08-31 5 2 2020-09-01 2022-12-31 . 2022-12-31
Can you please explain your business logic in more detail?
You want to modify the current stop_date if the following start_date overlaps:
data want (drop=nxt_:) ;
set have (keep=id);
by id;
merge have have (firstobs=2 keep=start_date rename=(start_date=nxt_start));
if last.id=0 then stop_date=min(stop_date,nxt_start-1);
run;
Here is a method for looking ahead to see the value of START_DATE on the next observation.
data want ;
set have ;
by id start_date stop_date ;
set have(keep=start_date rename=(start_date=next_date) firstobs=2)
have(obs=1 drop=_all_)
;
if last.id then next_date=.;
new_stop = min(stop_date,next_date-1);
format new_stop yymmdd10.;
run;
Obs ID START_DATE STOP_DATE next_date new_stop 1 1 2019-01-01 2019-12-31 2020-01-01 2019-12-31 2 1 2020-01-01 2020-12-31 . 2020-12-31 3 2 2020-01-01 2020-10-31 2020-01-07 2020-01-06 4 2 2020-01-07 2021-12-31 2020-09-01 2020-08-31 5 2 2020-09-01 2022-12-31 . 2022-12-31
Next code is tested:
data have;
infile cards dlm = '09'x truncover;
input ID Start_date yymmdd10. @+1 Stop_date yymmdd10.;
format start_date stop_date yymmdd10.;
cards;
1 2019-01-01 2019-12-31
1 2020-01-01 2020-12-31
2 2020-01-01 2020-10-31
2 2020-01-07 2021-12-31
2 2020-09-01 2022-12-31
; run;
proc sort data=have;
by ID descending start_date descending stop_date;
run;
data want;
set have;
by ID;
prev_start = lag(start_date);
if not first.ID and
stop_date > prev_start
then stop_date = prev_start -1;
drop prev_start;
run;
proc sort data=want;
by ID start_date stop_date;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.