I have 2 datasets - one demo (1 row per ID) and one specific event (multiple row per ID). What I would like to do is merge these two tables but have yes_no from the specific event table for one that matches. If there are two or more event in specific table, it should show 1 and 0 for the Ids that don't match
demo
ID | male | race | Location |
1 | 1 | White | US |
2 | 0 | Asian | Japan |
3 | 1 | Unknown | Spain |
4 | 1 | Unknown | Japan |
event
ID | date | term | grade |
1 | 1/2/2010 | Fever | 1 |
1 | 4/2/2020 | Rash | 2 |
2 | 1/2/2010 | Fever | 1 |
4 | 4/1/2011 | Cold | 1 |
want:
ID | male | race | Location | event_yes |
1 | 1 | White | US | 1 |
2 | 0 | Asian | Japan | 1 |
3 | 1 | Unknown | Spain | 0 |
4 | 1 | Unknown | Japan | 1 |
I tried:
proc sql;
select a.*, count(b.ID) as yes_no
from demo a
left join event b on a.ID = b.ID;
quit;
it works but it shows me sum of all events per ID
data demo;
input id male race $ Location $;
datalines;
1 1 White US
2 0 Asian Japan
3 1 Unknown Spain
4 1 Unknown Japan
;
data event;
input ID date :mmddyy10. term $ grade;
format date mmddyy10.;
datalines;
1 1/2/2010 Fever 1
1 4/2/2020 Rash 2
2 1/2/2010 Fever 1
4 4/1/2011 Cold 1
;
data want;
merge demo event(in = e keep = id);
by id;
event_yes = e;
run;
Result:
id male race Location event_yes 1 1 White US 1 1 1 White US 1 2 0 Asian Japan 1 3 1 Unknown Spain 0 4 1 Unknown Japan 1
data event;
input ID date :mmddyy10. term $ grade;
format date mmddyy10.;
datalines;
1 1/2/2010 Fever 1
1 4/2/2020 Rash 2
2 1/2/2010 Fever 1
4 4/1/2011 Cold 1
;
proc sql;
create table want as
select distinct a.*,a.id=b.id as event_yes
from demo a left join event b
on a.id=b.id;
quit;
/*Keep term and grade*/
proc sql;
create table want as
select a.*,term, grade, a.id=b.id as event_yes
from demo a left join event b
on a.id=b.id;
quit;
Notes:
1. a.id=b.id as event_yes /*resolves to a boolean expression 1 or 0 for matches & non matches*/
2. You could choose to keep term and grade and that helps to debug later downstream. Anyways, up to you
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.