This is a good task for waxing a little didactic about the queue-based nature of the lag function:
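data want;
  /* Interleave have with itself: within each id, every record is read
     once as firstpass and then again as secondpass */
  set have (in=firstpass)
      have (in=secondpass);
  by id;
  /* First pass: count the 1-to-0 transitions */
  if firstpass then sum_one_to_zero + (lag(flag)=1 and flag=0);
  if first.id then sum_one_to_zero=0;   /* reset at the start of each id */
  if secondpass;                        /* keep only second-pass records */
  /* Second pass: mark each record where a 1-to-0 transition occurs */
  if lag(flag)=1 and flag=0 then one_to_zero=1;
  if lag(id)^=id then one_to_zero=.;    /* first record of an id has no predecessor */
run;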
The statement:
if firstpass then sum_one_to_zero + (lag(flag)=1 and flag=0);
compares the current flag to the preceding flag, building a running total of transitions from 1 to 0. Because the lag function is in the then clause, it is executed only for firstpass cases; secondpass cases never touch the queue underlying this lag function.
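A minimal sketch of that queue behaviour, using a made-up data set (the names demo, x, pick, prev_picked and prev_any are illustrative only, not from the original problem). Each lag call has its own queue, and a queue is updated only when its call actually executes:

```sas
data demo;
  input x pick;
  /* This lag is executed only when pick=1, so its queue only ever
     holds x values from pick=1 observations */
  if pick then prev_picked = lag(x);
  /* This lag is executed on every observation */
  prev_any = lag(x);
  datalines;
1 1
2 0
3 1
4 0
5 1
;
run;

proc print data=demo noobs;
run;
/* prev_picked: . . 1 . 3   (the pick=0 rows never touched its queue)
   prev_any:    . 1 2 3 4 */
```

On the third row prev_picked is 1, not 2, because the second row never passed through that lag call.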
To keep results from the preceding id from contaminating the current id, the sum is reset to zero at the start of each id:
if first.id then sum_one_to_zero=0;
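To see why the reset matters, here is a simplified single-pass sketch on hypothetical data (the data set contents and the variables running and per_id are assumptions for illustration, not part of the program above). Id 1 ends with flag=1 and id 2 starts with flag=0, so the lag queue reports a false transition across the id boundary:

```sas
data have;                 /* hypothetical sample, sorted by id */
  input id flag;
  datalines;
1 1
1 0
1 1
2 0
2 1
2 0
;
run;

data reset_demo;
  set have;
  by id;
  /* Never reset: carries id 1's count, plus the false boundary
     transition, into id 2 */
  running + (lag(flag)=1 and flag=0);
  /* Reset after the sum statement, at the start of each id */
  per_id + (lag(flag)=1 and flag=0);
  if first.id then per_id = 0;
run;
/* Last record of id 1: running=1, per_id=1
   Last record of id 2: running=3, per_id=1 */
```

Because the reset comes after the sum statement, the false transition counted on the first record of id 2 is discarded along with the previous id's total.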
The third use of the lag function is more subtle. The statement
if lag(id)^=id then one_to_zero=.;
appears to test whether the record-in-hand is the start of an id. So why not just use
if first.id then one_to_zero=.; /*Do not use this for secondpass*/
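Because this part of the program, coming after the subsetting "if secondpass;", deals only with secondpass observations, while the first.id condition is true only for a firstpass observation: within each id the firstpass records are read before the secondpass records, so the first record of the by group is always a firstpass one. You have to realize that this part of the program is processing only groups of secondpass observations, so the start of a group has to be detected by comparing id with lag(id).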
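If you want to convince yourself of this, a small check along these lines should do it (the sample data and the variables pass and is_first_id are hypothetical, added only to expose what the interleaved set statement is doing):

```sas
data have;                 /* hypothetical sample, sorted by id */
  input id flag;
  datalines;
1 1
1 0
1 1
2 0
2 1
2 0
;
run;

data check;
  set have (in=firstpass)
      have (in=secondpass);
  by id;
  /* Record which copy of have the observation came from */
  if firstpass then pass = 1;
  else pass = 2;
  /* Copy the automatic variable so it is kept in the output */
  is_first_id = first.id;
run;

proc print data=check noobs;
run;
/* Within each id the pass=1 rows come first, then the pass=2 rows.
   is_first_id is 1 only on the first pass=1 row of each id;
   every pass=2 row has is_first_id=0. */
```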