Hi,
I have a dataset I want to identify if two events occurred on the same date. I am not sure how to even start.
My data look like this
ID date event1 event2
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
I would like to add a flag like this
ID date event1 event2 Flag
1 4/5/05 1 1 1
1 4/5/05 0 1 1
1 4/6/05 0 0 0
2 5/1/05 1 1 0
2 5/2/05 0 1 1
2 5/2/05 1 0 1
Thanks much!
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
do until(last.date);
set have;
by id date;
if event1 then has_event1=1;
if event2 then has_event2=1;
end;
flag=0;
do until(last.date);
set have;
by id date;
if has_event1 and has_event2 then flag=1;
output;
end;
drop has_:;
run;
Post test data in the form of a datastep!!
As such , this is not tested:
proc sql; create table want as select a.*, case when b.id ne . then 1 else 0 end as flag from have a left join have b on a.id=b.id and a.date=b.date and sum(a.event1,a.event2) ne sum(b.event1,b.event2); quit;
So basically I am merging have back to itself where id and date match, but event 1 and 2 don't.
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
do until(last.date);
set have;
by id date;
if event1 then has_event1=1;
if event2 then has_event2=1;
end;
flag=0;
do until(last.date);
set have;
by id date;
if has_event1 and has_event2 then flag=1;
output;
end;
drop has_:;
run;
Thanks ! This was a great help!
Maybe my esteemed colleagues @RW9 and @Ksharp understand the problem and the meaning of event1 and event2 in the original table, but I do not understand what event1 and event2 are, or why I should care if we are truly looking at dates.
But here is my answer, ignoring the variables event1 and event2, and assuming the data is properly sorted
proc freq data=have;
by id;
tables date/out=a;
run;
data want;
merge have a;
by id date;
if count>1 then flag=1; else flag=0;
run;
Given the data are sorted by id/date, you just need a way to look at the next date (firstobs=2) and at the preceding date (lag(date)). And you need a way to determine whether the record-in-hand is at the begin or end of an id group (SET and BY statements):
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
data want;
set have (keep=id);
by id;
merge have have (firstobs=2 keep=date rename=(date=nxt_date));
if (date=nxt_date and last.id=0) or (date=lag(date) and first.id=0)
then flag=1;
run;
data have;
input ID date $ event1 event2;
cards;
1 4/5/05 1 1
1 4/5/05 0 1
1 4/6/05 0 0
2 5/1/05 1 1
2 5/2/05 0 1
2 5/2/05 1 0
;
run;
proc sql;
create table want as
select *,sum(max(event1 , event2))>1 as flag
from have
group by id, date;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.