You can make an EVENT file from work.new. Each EVENT record could have these variables:
CUSIP
EVENT_NUMBER (a running count of events within each CUSIP)
EVENT_PRE_YEAR (the year preceding a sequence of at least two years with BUILT1=1)
EVENT_POST_YEAR (the year following at least two year with BUILT1=1 preceded by a BUILT1=.).
Here's a code that will do so:
data events (keep=cusip event_number pre_event_year post_event_year);
array hist{1987:2010} ;
/* Build an historical array of BUILT1 values */
do until (last.cusip);
set work.new;
by cusip notsorted;
if first.cusip then cusip_start=fyear;
if built1=. then built1=0;
hist{fyear}=built1;
end;
cusip_end=fyear;
pre_event_year=.;
/* Now scan the array, looking for sequences of 0 */
/* followed by two or more 1's followed by a 0 */
do yr=cusip_start to cusip_end;
if hist{yr}=0 then do; /* YR is a potential event-span start or end */
if pre_event_year^=. and hist{yr-1}=1 and hist{yr-2}=1 then post_event_year=yr;
else pre_event_year=yr;
end;
if n(pre_event_year,post_event_year)=2 then do;
event_number=sum(event_number,1);
output;
call missing(pre_event_year,post_event_year);
end;
end;
run;
proc freq data=events;
tables event_number;
run;
Notes:
The proc freq will report both the total number of events and total number of firms having at least one event. The frequency of EVENT_NUMBER=1 is the number of firms, and the total of all values of EVENT_NUMBER is the number of events.
No year is counted twice, I.e.,, while a sequence of built1= ., 1, 1, ., ., 1, 1, . is two events (because there are two years with BUILT1=. in the middle), the sequence ., 1, 1, ., 1, 1, . is a SINGLE event.
... View more