Solved: Identify and isolate gaps in between date ranges

JDWall · Posted 02-07-2019 01:05 AM

I'm working on identifying events that happened in date ranges that aren't specifically covered by the data. To do this, I have a list of date ranges, some sequential, some with gaps. What I'm tyring to get is a list of date ranges that span the gaps. For example:

Cat1	Cat2	EffDt	TermDt
A	1	8/8/2016	10/16/2016
A	1	11/1/2016	1/10/2017
A	1	1/11/2017	4/23/2017
A	1	4/24/2017	7/19/2017
A	1	7/20/2017	9/30/2017
A	1	10/1/2017	3/31/2018
A	2	10/1/2017	4/29/2018
A	2	10/1/2017	7/1/2018

In this case, A1 is covered from 8/8/2016 through 10/16/2016, followed by a gap from 10/17/2016 through 10/31/2016, and coevered again from 11/1/2016 through 3/31/2018. What I have been trying to get out of this is the 10/17/2016 amd 10/31/2016 values. In the event that there were multiple gaps, I'd want to identify each of them. I've been trying a few different loop permutations, but haven't been able to get anything to work quite right.

Any help would be greatly appreciated.

Thanks!

Ksharp · Posted 02-07-2019 10:49 AM

data have;
input Cat1 $ Cat2 $ (EffDt TermDt)(:mmddyy10.);
format EffDt TermDt mmddyy10.;
datalines;
A 1 8/8/2016 10/16/2016
A 1 11/1/2016 1/10/2017
A 1 1/11/2017 4/23/2017
A 1 4/24/2017 7/19/2017
A 1 7/20/2017 9/30/2017
A 1 10/1/2017 3/31/2018
A 2 10/1/2017 4/29/2018
A 2 10/1/2017 7/1/2018
;

data want;
 set have;
 by cat1 cat2;
 lag=lag(TermDt);
 d=EffDt-lag;
 if d>1 and not first.cat2 then do;
   TermDt=EffDt-1;EffDt=lag+1;output;
 end;
 drop d lag;
run;

Assuming data is ordered.

View solution in original post

PeterClemmensen · Posted 02-07-2019 01:40 AM

HI and welcome to the SAS Communities 🙂

It is not clear to me if you want both Cat1 and Cat2 to constitute a group here. However, i think this gives you what you want. It inserts a new line with the 'date gap' whenever a date gap is detected. Let me know if it works for you.

data have;
input Cat1 $ Cat2 $ (EffDt TermDt)(:mmddyy10.);
format EffDt TermDt mmddyy10.;
datalines;
A 1 8/8/2016 10/16/2016
A 1 11/1/2016 1/10/2017
A 1 1/11/2017 4/23/2017
A 1 4/24/2017 7/19/2017
A 1 7/20/2017 9/30/2017
A 1 10/1/2017 3/31/2018
A 2 10/1/2017 4/29/2018
A 2 10/1/2017 7/1/2018
;

data want(drop= lagTermDt _:);
   set have;
   by Cat2;

   lagTermDt=lag1(TermDt);
   if first.Cat2 then lagTermDt=.;

   if intck('day', lagTermDt, EffDt)>1 & lagTermDt ne . then do;
      _EffDt=EffDt; _TermDt=TermDt;
      EffDt=lagTermDt+1;
      TermDt=_EffDt-1;
      output;

      EffDt=_EffDt; TermDt=_TermDt;
      output;return;
   end;

   output;
run;

The DATA to DATA Step Macro
Blog: SASnrd

ScottBass · Posted 02-07-2019 01:49 AM

Please post your code using a complete data step and datalines statement so we don't have to work so hard to help you out.

As to your problem, and only considering the pattern in the data you posted:

Use a surrogate key (record pointer) and SQL's Cartesian product to "look ahead" to the next record.

data have;
   length cat1 $1 cat2 8 start end 8;
   format start end date9.;
   informat start end mmddyy10.;
   input cat1 cat2 start end;
   datalines;
A   1   8/8/2016   10/16/2016
A   1   11/1/2016   1/10/2017
A   1   1/11/2017   4/23/2017
A   1   4/24/2017   7/19/2017
A   1   7/20/2017   9/30/2017
A   1   10/1/2017   3/31/2018
A   2   10/1/2017   4/29/2018
A   2   10/1/2017   7/1/2018
;
run;

* create surrogate key ;
data sk / view=sk;
   sk+1;
   set have;
run;

proc sql;
   * for illustration/debugging ;
   select
       a.*
      ,b.*
   from
      sk a
   join
      sk b
   on
      a.cat1=b.cat1
      and
      a.cat2=b.cat2
      and 
      a.sk=b.sk-1
   ;

   create table want as
   select
       a.cat1
      ,a.cat2
      ,a.start as old_start
      ,a.end   as old_end
      ,b.start as new_start
      ,b.end   as new_end
      ,intck('day',old_end,new_start) as gap
   from
      sk a
   join
      sk b
   on
      a.cat1=b.cat1
      and
      a.cat2=b.cat2
      and 
      a.sk=b.sk-1
   where
      intck('day',old_end,new_start) > 1
   ;
quit;

You can tidy this up a bit, dropping old_start and new_end if you wish.

This doesn't cover say overlapping dates, where the new start date < old end date. Again, see bold type above.

Is there a typo in EffDt in your last dataline?

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

s_lassen · Posted 02-07-2019 03:43 AM

If you want to find events in another dataset that are not covered by date ranges, you can do that directly, something like

Proc sql;
  create table uncovered as select * from events
  where not exists(select * from date_ranges
      where cat1=events.cat1
        and cat2=events.cat2
        and events.date>=EffDt 
        and events.date<=TermDt);

I used the two last conditions instead of just using a BETWEEN condition because SAS SQL is peculiar in its interpretation of BETWEEN - "X between A and B" in SAS SQL can be true also in cases where A>B, such as your last observation, then one which @ScottBass suggested could be a typo.

Ksharp · Posted 02-07-2019 10:49 AM

data have;
input Cat1 $ Cat2 $ (EffDt TermDt)(:mmddyy10.);
format EffDt TermDt mmddyy10.;
datalines;
A 1 8/8/2016 10/16/2016
A 1 11/1/2016 1/10/2017
A 1 1/11/2017 4/23/2017
A 1 4/24/2017 7/19/2017
A 1 7/20/2017 9/30/2017
A 1 10/1/2017 3/31/2018
A 2 10/1/2017 4/29/2018
A 2 10/1/2017 7/1/2018
;

data want;
 set have;
 by cat1 cat2;
 lag=lag(TermDt);
 d=EffDt-lag;
 if d>1 and not first.cat2 then do;
   TermDt=EffDt-1;EffDt=lag+1;output;
 end;
 drop d lag;
run;

Assuming data is ordered.

Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

Re: Identify and isolate gaps in between date ranges

The 2025 SAS Hackathon has begun!

SAS Training: Just a Click Away