Hi,
I have two datasets (one from Excel, one from SAS) that each have first and last name variables. I am trying to join the two datasets on first name and last name:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (a.have1_first_name=b.have2_first_name) and (a.have1_last_name=b.have1_last_name);
quit;
The join worked on about 2/3 of the dataset, but not the other 1/3. The problem is that I can't figure out why the join isn't working on the remaining 1/3. I've looked some non-joiners up by hand and there are no initials or capitalization issues or anything like that--the values look identical to me, but still aren't recognized as identical by SAS. I need some ideas on how to troubleshoot and figure out the disparity between the values so I can complete the join.
Thanks!
Trim and set cases and recheck would be my suggestion:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (upper(trim(a.have1_first_name))=upper(trim(b.have2_first_name))) and (upper(trim(a.have1_last_name))=upper(trim(b.have1_last_name)));
quit;
Trim and set cases and recheck would be my suggestion:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (upper(trim(a.have1_first_name))=upper(trim(b.have2_first_name))) and (upper(trim(a.have1_last_name))=upper(trim(b.have1_last_name)));
quit;
I would agree, though
on upper(cats(a.have1_first_name,a.have1_last_name))=upper(cats(b.have1_first_name,b.have1_last_name))
Should also work.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.