I've interpreted the question as wanting to keep all observation for an id if that id meets the condition of 2 or more occurrences of 411 or 430 within variables dx1-dx5 at anytime within any 12-month rolling period. The flagged variable is used so that only one row for each id meeting the conditions output to the dataset 'idstokeep'. I've learned from this that the way the lag function behaves is affected by conditional expressions, and there has been some trial and error in getting to this. Finally, a caveat. This may not work if a single observation has 2 occurrences of 411 or 430, or a value of 411 and 430. (eg DX1=411 & DX2=411, or DX1=411 & DX2=430). If that is a real possibility then will need to make some changes. data have; input id date : mmddyy10. dx1 dx2 dx3 dx4 dx5; format date mmddyy10. ; * ids 1, 3, 4 have had repeat readings of 411 or 430 in a 1-year period , so all observations for these ids will be kept; datalines; 1 1/1/2005 411 412 413 414 415 1 2/2/2005 411 412 413 414 415 2 11/1/2006 411 412 413 418 419 2 3/3/2007 420 421 422 423 424 3 9/4/2006 430 440 450 460 470 3 6/4/2006 430 510 520 530 560 4 3/4/2006 430 440 450 460 470 4 6/12/2006 432 510 520 530 560 4 9/14/2006 430 510 520 530 560 5 3/4/2006 430 440 450 460 470 5 6/12/2006 432 510 520 530 560 5 4/14/2007 430 510 520 530 560 ; run; proc sort data=have; by id date; proc transpose data=have out=trans(where=(col1 in (411,430))); by id date; run; data idstokeep (keep=id); set trans; retain flagged; by id date; if first.id then flagged=0 ; if not first.id and INTCK('YEAR',lag(date),date,'C') =0 and flagged=0 then do; flagged=1; output; end; run; data want; merge idstokeep (in=idwanted) have; by id; if idwanted; run;
... View more