@djeon
If you follow @Reeza's advice and, in future, post your sample data as a working data step, you'll have a much better chance that someone is motivated to write some code for you.
In the code below I've assumed that, within a run of records that are each 42 days or less apart from the previous one, you still want to keep any record that is more than 42 days after the last record you kept. For example, if records 2 and 3 are each within 42 days of the record before them, BUT record 3 is more than 42 days after record 1, then records 1 AND 3 are kept and record 2 is dropped.
data have;
  infile cards dlm=',' dsd;
  informat a $10. b yymmdd10. c d e f g $10.;
  format b date9.;
  input a $ b c d e f g;
  cards;
SL00106714, 2013-08-26, TOE, R, S, S, S
SL00106714, 2013-11-12, TOE, R, S, S, S
SL00106723, 2013-04-17, NARE, , , ,
SL00106723, 2013-04-17, NARE, , , ,
SL00106739, 2013-04-14, THIGH, R, S, S, S
SL00106781, 2010-05-04, SKIN, S, S, S,
SL00106781, 2010-05-04, TOE, S, S, S,
SL00106781, 2012-11-25, CHIN, R, R, S, S
SL00106781, 2013-01-07, BLOOD, R, R, S,
SL00106781, 2013-01-07, BLOOD, R, R, S,
SL00106781, 2013-02-01, BLOOD, R, R, S, S
SL00106781, 2013-05-22, ELBOW, R, R, S, S
SL00106781, 2013-07-26, ELBOW, R, R, S, S
;
run;
proc sort data=have out=inter;
  by a c d e f g b;
run;
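data want;
  set inter;
  by a c d e f g b;
  format _lag_b _last_kept_b date9.;
  retain _last_kept_b;
  /* call LAG() unconditionally so its queue is updated on every iteration */
  _lag_b=lag(b);
  if first.g then
    do;
      /* the first record of each group is always kept */
      _last_kept_b=b;
    end;
  else
    do;
      if missing(_last_kept_b) then _last_kept_b=b;
      /* drop a record that is within 42 days of both the previous record and the last kept record */
      if 0<=(b - _lag_b)<=42 and 0<=(b - _last_kept_b)<=42 then delete;
      else _last_kept_b=b;
    end;
run;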
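N.B.: LAG() and DIF() read from a queue that is only updated when the function actually executes, so they return the value from the last time the function ran, not necessarily from the previous observation. They should therefore execute in every single iteration of the data step; never use them within a conditional code block. That's the reason I call LAG() unconditionally up front and couldn't use DIF() inside my conditional logic.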
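To illustrate the point about conditional execution, here is a minimal sketch using a made-up data set (lag_demo, id, x, lag_always and lag_conditional are hypothetical names, not part of the question's data). It compares a LAG() call that runs on every iteration with one that only runs when a condition is true.

```sas
data lag_demo;
  input id $ x;
  /* unconditional call: this LAG queue is updated on every iteration */
  lag_always = lag(x);
  /* conditional call: this separate queue is only updated when id='B',
     so it returns x from the last iteration where the call executed */
  if id = 'B' then lag_conditional = lag(x);
  datalines;
A 1
B 2
A 3
B 4
;
run;

proc print data=lag_demo noobs;
run;
```

On the fourth observation lag_always is 3 (the previous row), whereas lag_conditional is 2, the value of x from the second observation, because that was the last time the conditional LAG() call actually ran.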