Because you know the range of possible years (2000:2018), you can use an array indexed by year, where each element holds the number of consecutive years up to and including that year. After reading all the obs for an id, find the maximum size in the array. The year at which it occurs is the end year of the longest span, and from it you can calculate the corresponding beginning year.
Then reread all the years for the same id, keeping only those between maxsizbeg and maxsizend:
data have;
informat ID 1. year 4.;
input ID year;
cards;
1 2000
1 2001
1 2002
1 2004
1 2005
1 2006
1 2007
1 2008
1 2010
1 2011
2 2000
2 2001
2 2002
2 2003
2 2004
2 2005
2 2006
2 2007
2 2008
2 2010
2 2017
3 2001
3 2002
3 2003
3 2016
3 2017
3 2018
;
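data want;
  /* One element per year; 1999 is a guard slot so siz{year-1} is valid for 2000 */
  array siz {1999:2018} _temporary_;
  set have;
  by id;
  if first.id then call missing(of siz{*});  /* reset the sizes for a new id */
  siz{year} = sum(siz{year-1},1);            /* extend the span ending last year, or start at 1 */
  if last.id;                                /* continue only once the whole id has been read */
  maxsiz = max(of siz{*});                   /* length of the longest span */
  maxsizbeg = lbound(siz) + whichn(maxsiz,of siz{*}) - maxsiz;  /* first year of that span */
  maxsizend = maxsizbeg + maxsiz - 1;        /* last year of that span */
  do until (last.id);  /* Reread and filter this id */
    set have;
    by id;
    if maxsizbeg<=year<=maxsizend then output;
  end;
run;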
The "trick" here is to create an array with a lower bound one year before your earliest data (i.e. 1999) and an upper bound equal to the last year in your data (2018). (A lower lower bound or a higher upper bound would do no harm.)
The statement:
siz{year} = sum(siz{year-1},1);
assigns a size value for the current year equal to one greater than the prior year's size value. But if the prior year was never encountered for this id, its size value is missing. Because the SUM function ignores missing arguments, the sum of missing and 1 is 1, so the current size is 1, i.e. the start of a new time span.
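To see why the SUM function matters here, the following small sketch compares it with the + operator. The variable names (prior, with_sum, with_plus) are made up for this illustration and are not part of the solution above.

```sas
data _null_;
  prior = .;                   /* prior year never seen, so its size is missing */
  with_sum  = sum(prior, 1);   /* SUM ignores missing arguments: result is 1 */
  with_plus = prior + 1;       /* the + operator propagates missing: result is . */
  put with_sum= with_plus=;    /* writes with_sum=1 with_plus=. to the log */
run;
```

Had the statement been written as siz{year} = siz{year-1} + 1, every span would start with a missing size instead of 1.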
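At the end of an id, get the maximum size, find out where it is in the array [whichn(maxsiz,of siz{*})], then determine the corresponding maxsizbeg year and maxsizend year. If two spans tie for the maximum size (as for id 3, with 2001-2003 and 2016-2018), whichn returns the first match, so the earlier span is kept.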
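As a worked check of that arithmetic, here is a sketch that fills the array by hand with the sizes id 1 would have after its last observation and then applies the same formulas. The variable pos is a hypothetical helper added only to show the intermediate whichn result.

```sas
data _null_;
  array siz {1999:2018} _temporary_;
  /* Sizes for id 1: spans 2000-2002, 2004-2008 and 2010-2011 */
  siz{2000}=1; siz{2001}=2; siz{2002}=3;
  siz{2004}=1; siz{2005}=2; siz{2006}=3; siz{2007}=4; siz{2008}=5;
  siz{2010}=1; siz{2011}=2;
  maxsiz    = max(of siz{*});              /* 5 */
  pos       = whichn(maxsiz, of siz{*});   /* 10: 2008 is the 10th element counting from 1999 */
  maxsizbeg = lbound(siz) + pos - maxsiz;  /* 1999 + 10 - 5 = 2004 */
  maxsizend = maxsizbeg + maxsiz - 1;      /* 2004 + 5 - 1 = 2008 */
  put maxsiz= pos= maxsizbeg= maxsizend=;
run;
```

So for id 1 the reread step keeps only the years 2004 through 2008.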