Hello all!
I have a dataset in which I would like to be able to find time periods that overlap for each id.
Below is an example of the start dataset:
id start_date end_date
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
I want to be able to identify the rows for each id with the overlapping periods, like below:
id start_date end_date
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
Can anyone guide me on this?
Thank you!
there is discussion of overlap here: http://www2.sas.com/proceedings/sugi31/048-31.pdf
they use arrays
A SQL self-join makes this sort of problem often quite simple.
data have;
infile datalines dlm=' ' truncover;
input id start_date:mmddyy. end_date:mmddyy.;
format start_date end_date date9.;
datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;
proc sql;
create table want as
select
l.*
from have l, have r
where
/* test only within same id */
l.id=r.id
/* do not join a record with itself */
and not (l.start_date=r.start_date and l.end_date=r.end_date)
/* find overlapping dates */
and
(
l.start_date between r.start_date and r.end_date
or
r.start_date between l.start_date and l.end_date
)
;
quit;
data have;
infile datalines dlm=' ' truncover;
input id start_date:mmddyy. end_date:mmddyy.;
format start_date end_date date9.;
datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;
data temp;
set have;
by id;
if start_date <= lag(end_date) then flag=1;
if first.id then flag=.;
run;
data temp;
set temp;
if missing(flag) then group+1;
run;
proc sql;
create table want as
select *
from temp
group by group
having sum(flag) > 0;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.