The beginning of the program has:
set have;
by policyno start_date end_date;
if first.end_date;
The BY statement tells SAS to (1) expect the data to be pre-sorted by policyno start_date end_date, and (2) set automatic dummies for each of the by variables indicating whether the record in hand is the first record for a given by value (first.policyno, first.start_date, first.end_date) or the last (last.policyno etc.). And what happens to the first. dummies when policyno changes but the dates don't? The treatment of by values is hierarchical, so if a given by var changes, setting its first. dummy, all lower order (to the right) first. dummies are also set, no matter what their value sequence is. In that case first.start_date and first.end_date are both 1, even though the dates themselves are unchanged from the previous record.
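To see the hierarchy for yourself, here is a minimal sketch. The three records are hypothetical sample data (not from your dataset): policyno changes on the third record while the dates stay the same.

```sas
/* Hypothetical sample data: rows 1 and 2 are exact duplicates,      */
/* row 3 has a new policyno but the same start_date and end_date.    */
data have;
  input policyno start_date :yymmdd10. end_date :yymmdd10.;
  format start_date end_date yymmdd10.;
  datalines;
101 2020-01-01 2020-12-31
101 2020-01-01 2020-12-31
102 2020-01-01 2020-12-31
;

data _null_;
  set have;
  by policyno start_date end_date;
  /* Write the automatic first. dummies to the log for every record */
  put policyno= first.policyno= first.start_date= first.end_date=;
run;
```

The log should show all three first. dummies as 1 for the first record, all 0 for the duplicate, and all 1 again for the third record, because the change in policyno forces the lower-order dummies on.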
The third statement is a subsetting IF (look up SAS "subsetting if"), which tells SAS to keep only records that are the first instance of a given policyno/start_date/end_date combination. Any other record is dropped at that point and the rest of the step is skipped for it, so the DO group never processes duplicates.
If you want an explanation of how the DO group works, take a multiyear policy and run the DO group, but inside it place a set of PUT statements to see what is happening to N, start_date, end_date, and date_dif:
e_date=end_date;
put 'A: ' e_date=yymmddn8.;
do N=1 to 10 while (start_date < e_date);
  put / 'B: ' N= start_date= end_date=;
  end_date=min(e_date,intnx('year',start_date,1,'s')-1);
  put 'C: ' end_date=;
  length date_dif $20;
  date_dif='One Year';
  if end_date<intnx('year',start_date,1,'s')-1 then date_dif=catx(' ',intck('month',start_date,end_date+1,'continuous'),'months');
  put 'D: ' date_dif=;
  output;
  start_date=end_date+1;
  put 'E: ' start_date=;
end;
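If you don't have a multiyear policy handy, something like the following self-contained step could be used to try it out. This is only a sketch: the single 2.5-year policy, the dataset name want, and the one-line PUT are my assumptions for illustration; the statements inside the DO group are the same as above without the A to E tracing.

```sas
/* Hypothetical policy running 2.5 years (2020-01-01 to 2022-06-30) */
data have;
  input policyno start_date :yymmdd10. end_date :yymmdd10.;
  format start_date end_date yymmdd10.;
  datalines;
101 2020-01-01 2022-06-30
;

data want;
  set have;
  by policyno start_date end_date;
  if first.end_date;                 /* subsetting IF: skip duplicates */
  length date_dif $20;
  e_date = end_date;                 /* remember the true policy end   */
  do N = 1 to 10 while (start_date < e_date);
    /* Cap each slice at one day before the next anniversary */
    end_date = min(e_date, intnx('year', start_date, 1, 's') - 1);
    date_dif = 'One Year';
    if end_date < intnx('year', start_date, 1, 's') - 1 then
      date_dif = catx(' ', intck('month', start_date, end_date + 1, 'continuous'), 'months');
    put N= start_date= end_date= date_dif=;
    output;                          /* one output row per slice       */
    start_date = end_date + 1;       /* next slice starts the day after */
  end;
run;
```

With this input I would expect three rows in want: two full years (2020 and 2021) labelled 'One Year', and a final slice from 2022-01-01 to 2022-06-30 labelled '6 months'.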