I have a dataset with the ID and event date for each id. Each id can have many events up to 40 event dates. I want to find id who have at least 2 events that happened within no more than 12 months from each other. In the case below, the new dataset should keep id 1 and 3 because both have at least 2 even dates that are less than 12 months from each other
id event_Date
1 1/1/2005
1 2/21/2007
1 2/22/2007
1 12/5/2004
2 1/1/2004
2 3/3/2007
2 7/5/2006
3 11/2/2004
3 1/5/2005
OK. Assuming I follow your logic .
data have;
input id event_Date : mmddyy10.;
format event_Date mmddyy10.;
cards;
1 1/1/2005
1 2/21/2007
1 2/22/2007
1 12/5/2004
2 1/1/2004
2 3/3/2006
2 7/5/2007
3 11/2/2004
3 1/5/2005
;
run;
proc sql;
create table want as
select * from have where id in (
select distinct a.id
from have as a,have as b
where a.id=b.id and a.event_Date ne b.event_Date and
abs(intck('month',a.event_Date,b.event_Date,'c')) le 12
);
quit;
But as your logic , 2 should be include too.
2 3/3/2007
2 7/5/2006
only have 8 month between them .
Your rule and your wanted result don't match.
id 2 has less than 12 months between July 5, 2006 and March 3, 2007.
OK. Assuming I follow your logic .
data have;
input id event_Date : mmddyy10.;
format event_Date mmddyy10.;
cards;
1 1/1/2005
1 2/21/2007
1 2/22/2007
1 12/5/2004
2 1/1/2004
2 3/3/2006
2 7/5/2007
3 11/2/2004
3 1/5/2005
;
run;
proc sql;
create table want as
select * from have where id in (
select distinct a.id
from have as a,have as b
where a.id=b.id and a.event_Date ne b.event_Date and
abs(intck('month',a.event_Date,b.event_Date,'c')) le 12
);
quit;
If the aforementioned discrepancy was just a typo, this might be a solution:
data have;
input id event_date :mmddyy10.;
format event_date mmddyy10.;
cards;
1 1/1/2005
1 2/21/2007
1 2/22/2007
1 12/5/2004
2 1/1/2004
2 3/3/2007
2 7/5/2005
3 11/2/2004
3 1/5/2005
;
run;
proc sort data=have;
by id event_date;
run;
data want (keep=id);
set have;
by id;
retain flag;
prev_date = lag(event_date);
if first.id then flag = 0;
else do;
if intck('month',prev_date,event_date,'cont') <= 12 then flag = 1;
end;
if last.id and flag then output;
run;
For the sake of safety and speed, I suggest you to use @Kurt_Bremser 's code ,
if there are two same event_date for the same ID , My code wouldn't be right.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.