## Determine coverage periods for data with gaps and overlaps

Solved
Regular Contributor
Posts: 247

# Determine coverage periods for data with gaps and overlaps

[ Edited ]

Hi...I have a dataset that contains the records for the application for product coverage which is given for a limited time period. I would like to obtain the overall coverage for each ID by Product number. If the coverage would have been overlapping for the same product number for each ID, it would have been easy by grouping by ID and Product Number and using the minimum of the Effective Dates and maximumof the Expiry Dates. However, it is possible to have a gap between coverages and this is the part that I am having problems with. Any suggestions....Thanks.

Have:

 ID Product Number Effective Date Expiry Date 2303 11232 20160102 20160227 2303 11232 20160212 20160325 2303 14356 20160212 20160325 2303 14356 20160325 20160420 2409 17809 20160113 20160217 2409 17809 20160220 20160326 2409 17809 20160301 20160320 2409 17809 20160324 20160430 2409 18488 20160119 20160219 2409 18488 20160218 20160318 2512 15932 20160123 20160227 2512 15932 20160227 20160329 2512 16554 20160124 20160213 2512 16554 20160214 20160315 2512 16554 20160314 20160417 2512 16554 20160420 20160521 2512 16554 20160519 20160620

Want:

 ID Product Number Effective Date Expiry Date 2303 11232 20160102 20160325 2303 14356 20160212 20160420 2409 17809 20160113 20160217 2409 17809 20160220 20160430 2409 18488 20160119 20160318 2512 15932 20160123 20160329 2512 16554 20160124 20160213 2512 16554 20160214 20160417 2512 16554 20160420 20160620

Accepted Solutions
Solution
‎06-21-2016 05:47 PM
Posts: 5,540

## Re: Determine coverage periods for data with gaps and overlaps

This should do the trick:

proc sort data=prod;
by ID ProductNumber EffectiveDate descending ExpiryDate;
run;

data want;
do until(last.productNumber);
set prod; by id productNumber;
format effD expD yymmdd10.;
if expD < effectiveDate then
if not missing(expD) then do;
output;
call missing(effD, expD);
end;
effD = min(effD, effectiveDate);
expD = max(expD, expiryDate);
end;
output;
drop effectiveDate expiryDate;
rename effD=effectiveDate expD=expiryDate;
run;

proc print data=want noobs; run;
PG

All Replies
PROC Star
Posts: 2,370

## Re: Determine coverage periods for data with gaps and overlaps

[ Edited ]

Like this? (edited: more robust on case of nested date intervals).

data HAVE;
input ID \$ Product_Number \$ Effective_Date 8. Expiry_Date : 8.;
cards;
2303 11232 20160102 20160227
2303 11232 20160212 20160325
2303 14356 20160212 20160325
2303 14356 20160325 20160420
2409 17809 20160113 20160217
2409 17809 20160220 20160326
2409 17809 20160301 20160320
2409 17809 20160324 20160430
2409 18488 20160119 20160219
2409 18488 20160218 20160318
2512 15932 20160123 20160227
2512 15932 20160227 20160329
2512 16554 20160124 20160213
2512 16554 20160214 20160315
2512 16554 20160314 20160417
2512 16554 20160420 20160521
2512 16554 20160519 20160620
2513 16555 20160314 20160417
2513 16555 20160415 20160416
2513 16555 20160522 20160620
run;
data WANT;
set HAVE (rename=(Effective_Date=D1 Expiry_Date=D2));
by ID Product_Number D1;
retain KEEP_D1 KEEP_D2 ;
if KEEP_D1=. then do;            %* date interval not set;
KEEP_D1=D1;                    %* start new date interval;
KEEP_D2=D2;
end;
if D1 > KEEP_D2 > . then do;     %* same product and gap between dates;
Effective_Date = KEEP_D1;      %* save old dates to table;
Expiry_Date    = KEEP_D2;
output;
KEEP_D1 = D1 ;                 %* start new date interval with new dates;
KEEP_D2 = D2 ;
end;
KEEP_D2=max(KEEP_D2,D2);         %* update end date of interval if greater ;
if last.Product_Number then do;  %* last record for product;
Effective_Date = KEEP_D1;      %* save dates to table;
Expiry_Date    = KEEP_D2;
output;
KEEP_D1=.;                     %* reset date interval;
end;
run;

Super User
Posts: 10,787

## Re: Determine coverage periods for data with gaps and overlaps

[ Edited ]

There are many scenario  you need to consider about .

data have;
infile datalines expandtabs truncover;
input ID	ProductNumber	(EffectiveDate	ExpiryDate) (: yymmdd10.);
format EffectiveDate	ExpiryDate yymmdd10.;
datalines;
2303	11232	20160102 20160227
2303	11232	20160212 20160325
2303	14356	20160212	20160325
2303	14356	20160325	20160420
2409	17809	20160113	20160217
2409	17809	20160220	20160326
2409	17809	20160301	20160320
2409	17809	20160324	20160430
2409	18488	20160119	20160219
2409	18488	20160218	20160318
2512	15932	20160123	20160227
2512	15932	20160227	20160329
2512	16554	20160124	20160213
2512	16554	20160214	20160315
2512	16554	20160314	20160417
2512	16554	20160420	20160521
2512	16554	20160519	20160620
;
run;
data temp;
set have;
n+1;
date=EffectiveDate;output;
date=ExpiryDate;output;
format date yymmdd10.;
drop EffectiveDate	ExpiryDate;
run;
data temp;
set temp;
by ID	ProductNumber n;
if first.ProductNumber or (first.n and date gt lag(date)+1) then group+1;
run;
data want;
do until(last.group);
set temp;
by group;
if first.group then EffectiveDate=date;
end;
ExpiryDate=date;
drop date n group;
format EffectiveDate ExpiryDate yymmdd10.;
run;
Solution
‎06-21-2016 05:47 PM
Posts: 5,540

## Re: Determine coverage periods for data with gaps and overlaps

This should do the trick:

proc sort data=prod;
by ID ProductNumber EffectiveDate descending ExpiryDate;
run;

data want;
do until(last.productNumber);
set prod; by id productNumber;
format effD expD yymmdd10.;
if expD < effectiveDate then
if not missing(expD) then do;
output;
call missing(effD, expD);
end;
effD = min(effD, effectiveDate);
expD = max(expD, expiryDate);
end;
output;
drop effectiveDate expiryDate;
rename effD=effectiveDate expD=expiryDate;
run;

proc print data=want noobs; run;
PG
🔒 This topic is solved and locked.