Hello,
I am attempting to join 2 datasets through at least one variable, in a hierarchy. For example, join on first name, or join on surname, in that order. However this looks like it works, but not perfectly. It creates duplicates in my first dataset by joining to different records in the second dataset. Is there a way to perform a join and making it stop once a match is found. The below example is not exactly the code. There are actually around 44 joins I need it to run through. I know that it isn't doing both joins separately, because if it were I would get over 2000 additional duplicates (I'm currently getting 52), but ideally I would like no duplicates, once a match is found on the first join, I would like it to stop. I hope I've explained this okay. Thanks in advance for any help you can provide.
Current Code Example:
proc sql;
create table joined as
select a.*, b.*
from Houeshold as a full join Individual as b
on a.ID1=b.ID1
or a.ID2=b.ID2;
run;
Result
Dataset1ID Dataset2ID
1 a
1 b
2 c
2 d
3 e