Hi,
I have a dataset I want to identify if two events occurred on the same date. I am not sure how to even start.
My data look like this
ID date event1 event2
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
I would like to add a flag like this
ID date event1 event2 Flag
1 4/5/05 1 1 1
1 4/5/05 0 1 1
1 4/6/05 0 0 0
2 5/1/05 1 1 0
2 5/2/05 0 1 1
2 5/2/05 1 0 1
Thanks much!
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
do until(last.date);
set have;
by id date;
if event1 then has_event1=1;
if event2 then has_event2=1;
end;
flag=0;
do until(last.date);
set have;
by id date;
if has_event1 and has_event2 then flag=1;
output;
end;
drop has_:;
run;
Post test data in the form of a datastep!!
As such , this is not tested:
proc sql; create table want as select a.*, case when b.id ne . then 1 else 0 end as flag from have a left join have b on a.id=b.id and a.date=b.date and sum(a.event1,a.event2) ne sum(b.event1,b.event2); quit;
So basically I am merging have back to itself where id and date match, but event 1 and 2 don't.
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
do until(last.date);
set have;
by id date;
if event1 then has_event1=1;
if event2 then has_event2=1;
end;
flag=0;
do until(last.date);
set have;
by id date;
if has_event1 and has_event2 then flag=1;
output;
end;
drop has_:;
run;
Thanks ! This was a great help!
Maybe my esteemed colleagues @RW9 and @Ksharp understand the problem and the meaning of event1 and event2 in the original table, but I do not understand what event1 and event2 are, or why I should care if we are truly looking at dates.
But here is my answer, ignoring the variables event1 and event2, and assuming the data is properly sorted
proc freq data=have;
by id;
tables date/out=a;
run;
data want;
merge have a;
by id date;
if count>1 then flag=1; else flag=0;
run;
Given the data are sorted by id/date, you just need a way to look at the next date (firstobs=2) and at the preceding date (lag(date)). And you need a way to determine whether the record-in-hand is at the begin or end of an id group (SET and BY statements):
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
set have (keep=id);
by id;
merge have have (firstobs=2 keep=date rename=(date=nxt_date));
if (date=nxt_date and last.id=0) or (date=lag(date) and first.id=0)
then flag=1;
run;
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
proc sql;
create table want as
select *,sum(max(event1 , event2))>1 as flag
from have
group by id, date;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.