BookmarkSubscribeRSS Feed
SarahW13
Obsidian | Level 7

Hello all!

 

I have a dataset in which I would like to be able to find time periods that overlap for each id.

 

Below is an example of the start dataset:

 

id      start_date       end_date

1       3/1/2016        6/2/2016

1       4/1/2016        7/1/2016

1       8/2/2016        12/1/2016

1       1/2/2017        3/1/2017

2       5/10/2014      12/1/2014

2       7/10/2014      10/1/2014

2       1/1/2015        3/2/2015

 

I want to be able to identify the rows for each id with the overlapping periods, like below:

 

id      start_date       end_date

1       3/1/2016        6/2/2016

1       4/1/2016        7/1/2016

 

2       5/10/2014      12/1/2014

2       7/10/2014      10/1/2014

 

Can anyone guide me on this?

Thank you!

3 REPLIES 3
pau13rown
Lapis Lazuli | Level 10

there is discussion of overlap here: http://www2.sas.com/proceedings/sugi31/048-31.pdf

they use arrays

Patrick
Opal | Level 21

A SQL self-join makes this sort of problem often quite simple.

data have;
  infile datalines dlm=' ' truncover;
  input id start_date:mmddyy. end_date:mmddyy.;
  format start_date end_date date9.;
  datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;

proc sql;
create table want as select l.* from have l, have r where /* test only within same id */ l.id=r.id /* do not join a record with itself */ and not (l.start_date=r.start_date and l.end_date=r.end_date) /* find overlapping dates */ and ( l.start_date between r.start_date and r.end_date or r.start_date between l.start_date and l.end_date ) ; quit;
Ksharp
Super User
data have;
  infile datalines dlm=' ' truncover;
  input id start_date:mmddyy. end_date:mmddyy.;
  format start_date end_date date9.;
  datalines;
1 3/1/2016 6/2/2016
1 4/1/2016 7/1/2016
1 8/2/2016 12/1/2016
1 1/2/2017 3/1/2017
2 5/10/2014 12/1/2014
2 7/10/2014 10/1/2014
2 1/1/2015 3/2/2015
;
run;
data temp;
 set have;
 by id;
 if start_date <= lag(end_date) then flag=1;
 if first.id then flag=.;
run;
data temp;
 set temp;
 if missing(flag) then group+1;
run;
proc sql;
create table want as
 select *
  from temp
   group by group
    having sum(flag) > 0;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 710 views
  • 1 like
  • 4 in conversation