Hi,
I have two datasets (one from Excel, one from SAS) that each have first and last name variables. I am trying to join the two datasets on first name and last name:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (a.have1_first_name=b.have2_first_name) and (a.have1_last_name=b.have1_last_name);
quit;
The join worked on about 2/3 of the dataset, but not the other 1/3. The problem is that I can't figure out why the join isn't working on the remaining 1/3. I've looked some non-joiners up by hand and there are no initials or capitalization issues or anything like that--the values look identical to me, but still aren't recognized as identical by SAS. I need some ideas on how to troubleshoot and figure out the disparity between the values so I can complete the join.
Thanks!
Trim and set cases and recheck would be my suggestion:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (upper(trim(a.have1_first_name))=upper(trim(b.have2_first_name))) and (upper(trim(a.have1_last_name))=upper(trim(b.have1_last_name)));
quit;
Trim and set cases and recheck would be my suggestion:
proc sql;
create table want as
select * from have1 a
inner join have2 b
on (upper(trim(a.have1_first_name))=upper(trim(b.have2_first_name))) and (upper(trim(a.have1_last_name))=upper(trim(b.have1_last_name)));
quit;
I would agree, though
on upper(cats(a.have1_first_name,a.have1_last_name))=upper(cats(b.have1_first_name,b.have1_last_name))
Should also work.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.