You could just use normal code for collapsing to continuous periods.
So perhaps something like this:
data have;
input ID $ (start end) (:yymmdd.);
format start end yymmdd10.;
cards;
A 2015-01-01 2015-12-31
A 2016-01-01 2016-12-31
A 2017-01-03 2017-12-31
B 2020-01-01 2020-12-31
B 2022-01-01 2022-12-31
;
data want;
set have ;
by id start end;
if first.id then do;
period=0;
lagend = start-2;
end;
retain period lagend min max;
format lagend min max yymmdd10.;
if 1<(start-lagend) then do;
if period then output;
period+1;
min=start;
max=end;
end;
max=max(max,end);
if last.id then output;
lagend=end;
drop start end lagend;
run;
Result
OBS ID period min max 1 A 1 2015-01-01 2016-12-31 2 A 2 2017-01-03 2017-12-31 3 B 1 2020-01-01 2020-12-31 4 B 2 2022-01-01 2022-12-31
And just modify it to only OUTPUT when PERIOD is 1.
data want;
set have ;
by id start end;
if first.id then do;
period=0;
lagend = start-2;
end;
retain period lagend min max;
format lagend min max yymmdd10.;
if 1<(start-lagend) then do;
if period=1 then output;
period+1;
min=start;
max=end;
end;
max=max(max,end);
if last.id and period=1 then output;
lagend=end;
drop start end lagend;
run;
Result
OBS ID period min max 1 A 1 2015-01-01 2016-12-31 2 B 1 2020-01-01 2020-12-31
What if you have more than one obs at earliest start or latest end?
And also, your last obs has start of 2022-01-01 and end of 2020-12-31. Doesn't make sense. In fact, that end is an actual tie between the last two obs.
Finally, is your data sorted by start? Your sample is.
You could just use normal code for collapsing to continuous periods.
So perhaps something like this:
data have;
input ID $ (start end) (:yymmdd.);
format start end yymmdd10.;
cards;
A 2015-01-01 2015-12-31
A 2016-01-01 2016-12-31
A 2017-01-03 2017-12-31
B 2020-01-01 2020-12-31
B 2022-01-01 2022-12-31
;
data want;
set have ;
by id start end;
if first.id then do;
period=0;
lagend = start-2;
end;
retain period lagend min max;
format lagend min max yymmdd10.;
if 1<(start-lagend) then do;
if period then output;
period+1;
min=start;
max=end;
end;
max=max(max,end);
if last.id then output;
lagend=end;
drop start end lagend;
run;
Result
OBS ID period min max 1 A 1 2015-01-01 2016-12-31 2 A 2 2017-01-03 2017-12-31 3 B 1 2020-01-01 2020-12-31 4 B 2 2022-01-01 2022-12-31
And just modify it to only OUTPUT when PERIOD is 1.
data want;
set have ;
by id start end;
if first.id then do;
period=0;
lagend = start-2;
end;
retain period lagend min max;
format lagend min max yymmdd10.;
if 1<(start-lagend) then do;
if period=1 then output;
period+1;
min=start;
max=end;
end;
max=max(max,end);
if last.id and period=1 then output;
lagend=end;
drop start end lagend;
run;
Result
OBS ID period min max 1 A 1 2015-01-01 2016-12-31 2 B 1 2020-01-01 2020-12-31
data have;
input ID $ Startdate :yymmdd12. Enddate :yymmdd12.;
format Startdate Enddate :yymmdd10.;
cards;
A 2015-01-01 2015-12-31
A 2016-01-01 2016-12-31
A 2017-01-03 2017-12-31
B 2020-01-01 2020-12-31
B 2022-01-01 2020-12-31
;
data temp;
set have;
do date=Startdate to Enddate;
output;
end;
format date yymmdd10.;
keep id date;
run;
proc sort data=temp out=temp2 nodupkey;
by id date;
run;
data temp3;
set temp2;
by id;
if first.id then group=0;
if first.id or dif(date) ne 1 then group+1;
if group=1;
run;
proc summary data=temp3;
by id;
var date;
output out=want(drop=_:) min=Startdate max=Enddate;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.