Hello,
I have data with the duration of the contract. However, some data is unstructured and date contracts overlap. In these cases, I would like the term of the contract to be in line with other contracts. Below is an example of what the data looks like:
ID | START_DATE | STOP_DATE |
1 | 2019-01-01 | 2019-12-31 |
1 | 2020-01-01 | 2020-12-31 |
2 | 2020-01-01 | 2020-10-31 |
2 | 2020-01-07 | 2021-12-31 |
2 | 2020-09-01 | 2022-12-31 |
ID 1 is shown as valid. I would like to transform ID 2 into the following form:
ID | START_DATE | STOP_DATE |
1 | 2019-01-01 | 2019-12-31 |
1 | 2020-01-01 | 2020-12-31 |
2 | 2020-01-01 | 2020-01-06 |
2 | 2020-01-07 | 2020-08-31 |
2 | 2020-09-01 | 2022-12-31 |
I know the basics of SAS and I don't know how to solve this task. Can You help me?
Here is a method for looking ahead to see the value of START_DATE on the next observation.
data want ;
set have ;
by id start_date stop_date ;
set have(keep=start_date rename=(start_date=next_date) firstobs=2)
have(obs=1 drop=_all_)
;
if last.id then next_date=.;
new_stop = min(stop_date,next_date-1);
format new_stop yymmdd10.;
run;
Obs ID START_DATE STOP_DATE next_date new_stop 1 1 2019-01-01 2019-12-31 2020-01-01 2019-12-31 2 1 2020-01-01 2020-12-31 . 2020-12-31 3 2 2020-01-01 2020-10-31 2020-01-07 2020-01-06 4 2 2020-01-07 2021-12-31 2020-09-01 2020-08-31 5 2 2020-09-01 2022-12-31 . 2022-12-31
Can you please explain your business logic in more detail?
You want to modify the current stop_date if the following start_date overlaps:
data want (drop=nxt_:) ;
set have (keep=id);
by id;
merge have have (firstobs=2 keep=start_date rename=(start_date=nxt_start));
if last.id=0 then stop_date=min(stop_date,nxt_start-1);
run;
Here is a method for looking ahead to see the value of START_DATE on the next observation.
data want ;
set have ;
by id start_date stop_date ;
set have(keep=start_date rename=(start_date=next_date) firstobs=2)
have(obs=1 drop=_all_)
;
if last.id then next_date=.;
new_stop = min(stop_date,next_date-1);
format new_stop yymmdd10.;
run;
Obs ID START_DATE STOP_DATE next_date new_stop 1 1 2019-01-01 2019-12-31 2020-01-01 2019-12-31 2 1 2020-01-01 2020-12-31 . 2020-12-31 3 2 2020-01-01 2020-10-31 2020-01-07 2020-01-06 4 2 2020-01-07 2021-12-31 2020-09-01 2020-08-31 5 2 2020-09-01 2022-12-31 . 2022-12-31
Next code is tested:
data have;
infile cards dlm = '09'x truncover;
input ID Start_date yymmdd10. @+1 Stop_date yymmdd10.;
format start_date stop_date yymmdd10.;
cards;
1 2019-01-01 2019-12-31
1 2020-01-01 2020-12-31
2 2020-01-01 2020-10-31
2 2020-01-07 2021-12-31
2 2020-09-01 2022-12-31
; run;
proc sort data=have;
by ID descending start_date descending stop_date;
run;
data want;
set have;
by ID;
prev_start = lag(start_date);
if not first.ID and
stop_date > prev_start
then stop_date = prev_start -1;
drop prev_start;
run;
proc sort data=want;
by ID start_date stop_date;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: