BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Siva_Harish
Obsidian | Level 7

 

Hi Community,

 

Can anyone help to get only overlapping records.

data have;
input EVENT_ID 6. EVENT_START_DATE DATETIME19. EVENT_END_DATE DATETIME19. ID 2.;
FORMAT EVENT_START_DATE EVENT_END_DATE DATETIME19. ;
datalines;
048110 10Jun2021:13:00:00 10Jun2021:14:00:00 6
049076 10Jun2021:14:00:00 10Jun2021:15:00:00 6
026365 20Sep2019:14:00:00 21Sep2019:18:00:00 9
026361 20Sep2019:16:30:00 20Sep2019:19:00:00 9
037522 25Sep2020:13:00:00 26Sep2020:11:30:00 9
035367 25Sep2020:16:30:00 25Sep2020:20:00:00 9
027322 20Sep2019:13:00:00 20Sep2019:17:00:00 2
026361 20Sep2019:16:30:00 20Sep2019:19:00:00 2
;
run;

 For id 6 wanted to remove those as there is no overlap between first event_start_date and second event_start_date or event_date. For id 2 there is half an hour overlap between two events (event 027322 started at 20Sep2019 13:00:00 and ended at 20Sep2019 17:00:00 and for same id started second event before first event ended(over lap of half an hour) .

 

Want : 

 

EVENT_ID EVENT_START_DATE EVENT_END_DATE ID
026365 20Sep2019 14:00:00 21Sep2019 18:00:00 9
026361 20Sep2019 16:30:00 20Sep2019 19:00:00 9
037522 25Sep2020 13:00:00 26Sep2020 11:30:00 9
035367 25Sep2020 16:30:00 25Sep2020 20:00:00 9
027322 20Sep2019 13:00:00 20Sep2019 17:00:00 2
026361 20Sep2019 16:30:00 20Sep2019 19:00:00 2
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

A slightly different take, using the basically same "look ahead" technique:

data want;
merge
  have
  have (
    firstobs=2
    keep=id event_start_date
    rename=(id=_id event_start_date=_start)
  )
;
if
  id = lag(id) and event_start_date < lag(event_end_date)
  or
  id = _id and event_end_date > _start
;
drop _:;
run;

@mkeintz if the dataset is sorted by id and event_start_date (or at least sorted by date within a id group), then an overlap in the third (or later) obs would force an overlap in the previous obs.

View solution in original post

4 REPLIES 4
mkeintz
PROC Star

Please show what your desired output dataset should look like.

 

And please provide your available data in a SAS data step format.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Siva_Harish
Obsidian | Level 7
Hi ,

i had modified my post .Can you help me now?
mkeintz
PROC Star

Thanks for setting up the data for ready code testing.  This program produces what you want:

 

data want (drop=_:);
  set have (keep=id);
  by id notsorted;
  merge have  
        have (firstobs=2 keep=event_start_date rename=(event_start_date=_nxt_evt));

  if last.id=0 and _nxt_evt<event_end_date
        or
     first.id=0 and event_start_date<lag(event_end_date);
run;

The program assumes:

  1. Data are grouped by ID.
  2. Data are sorted by event_start_date within each ID.
  3. Overlaps are examined only for consecutive records.

Number 3 is important because if you had three records, in which the 2nd and 3rd records don't overlap each other, but both overlap or are contained in the first, this code would not include the third record.  Is that condition possible in your data?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

A slightly different take, using the basically same "look ahead" technique:

data want;
merge
  have
  have (
    firstobs=2
    keep=id event_start_date
    rename=(id=_id event_start_date=_start)
  )
;
if
  id = lag(id) and event_start_date < lag(event_end_date)
  or
  id = _id and event_end_date > _start
;
drop _:;
run;

@mkeintz if the dataset is sorted by id and event_start_date (or at least sorted by date within a id group), then an overlap in the third (or later) obs would force an overlap in the previous obs.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 893 views
  • 0 likes
  • 3 in conversation