BookmarkSubscribeRSS Feed
SarahW13
Obsidian | Level 7

Hello all!

 

I have a dataset in which I would like to be able to find time periods that overlap for each id.

 

Below is an example of the start dataset:

 

id      start_date       end_date

1       3/1/2016        6/2/2016

1       4/1/2016        7/1/2016

1       8/2/2016        12/1/2016

1       1/2/2017        3/1/2017

2       5/10/2014      12/1/2014

2       7/10/2014      10/1/2014

2       1/1/2015        3/2/2015

 

I want to be able to identify the rows for each id with the overlapping periods, like below:

 

id      start_date       end_date

1       3/1/2016        6/2/2016

1       4/1/2016        7/1/2016

 

2       5/10/2014      12/1/2014

2       7/10/2014      10/1/2014

 

Can anyone guide me on this?

Thank you!

3 REPLIES 3
pau13rown
Lapis Lazuli | Level 10

there is discussion of overlap here: http://www2.sas.com/proceedings/sugi31/048-31.pdf

they use arrays

Patrick
Opal | Level 21

A SQL self-join makes this sort of problem often quite simple.

data have;
  infile datalines dlm=' ' truncover;
  input id start_date:mmddyy. end_date:mmddyy.;
  format start_date end_date date9.;
  datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;

proc sql;
create table want as select l.* from have l, have r where /* test only within same id */ l.id=r.id /* do not join a record with itself */ and not (l.start_date=r.start_date and l.end_date=r.end_date) /* find overlapping dates */ and ( l.start_date between r.start_date and r.end_date or r.start_date between l.start_date and l.end_date ) ; quit;
Ksharp
Super User
data have;
  infile datalines dlm=' ' truncover;
  input id start_date:mmddyy. end_date:mmddyy.;
  format start_date end_date date9.;
  datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;
data temp;
 set have;
 by id;
 if start_date <= lag(end_date) then flag=1;
 if first.id then flag=.;
run;
data temp;
 set temp;
 if missing(flag) then group+1;
run;
proc sql;
create table want as
 select *
  from temp
   group by group
    having sum(flag) > 0;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 852 views
  • 1 like
  • 4 in conversation