BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
axescot78
Quartz | Level 8

I am trying to merge data from one data set to another based on first and last name. I run the following code:

 

proc sql;
	create table merged_data as
	select *
	from pull_data
	where strip(upcase(last_name)) in (select strip(upcase(NameLast)) from reference_data)
		and strip(upcase(first_name)) in (select strip(upcase(NameFirst)) from reference_data);
quit;

This matches all last names without regard to first name. So in the example data below, fake_l_name_1 will match with fake_f_name_1 and fake_f_name_2, rather than with fake_f_name_1 only. Can anyone tell me why proc sql is ignoring the 'and' statement?

data pull_data;
input userID $4. first_name $14. last_name $14. ;
datalines;
1693 fake_f_name_1 fake_l_name_1
1693 fake_f_name_1 fake_l_name_1
1693 fake_f_name_1 fake_l_name_1
1693 fake_f_name_1 fake_l_name_1
1693 fake_f_name_1 fake_l_name_1
1693 fake_f_name_1 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1129 fake_f_name_2 fake_l_name_1
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1345 fake_f_name_3 fake_l_name_2
1492 fake_f_name_4 fake_l_name_2
1492 fake_f_name_4 fake_l_name_2
1492 fake_f_name_4 fake_l_name_2
1492 fake_f_name_4 fake_l_name_2
;

data reference_data;
input NameFirst $14. NameLast $14. ;
datalines;
fake_f_name_1	fake_l_name_1
fake_f_name_2	fake_l_name_2
fake_f_name_3	fake_l_name_3
fake_f_name_4	fake_l_name_4
fake_f_name_5	fake_l_name_5
fake_f_name_6	fake_l_name_6
fake_f_name_7	fake_l_name_7
fake_f_name_8	fake_l_name_8
;
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Thank you for providing sample data via working SAS code.

 

I assume what you're trying to achieve is an inner join. Code like below should do.

proc sql;
	create table merged_data as
	select a.*
	from pull_data a inner join reference_data b
    on strip(upcase(a.last_name))=strip(upcase(b.NameLast)) 
      and strip(upcase(a.first_name))=strip(upcase(b.NameFirst))
  ;
quit;

The filter condition you defined does independent lookups in table reference_data meaning the condition gets true if the first and last name in your base table exist anywhere in the lookup table reference_data - it doesn't need to be on the same row.

 

View solution in original post

1 REPLY 1
Patrick
Opal | Level 21

Thank you for providing sample data via working SAS code.

 

I assume what you're trying to achieve is an inner join. Code like below should do.

proc sql;
	create table merged_data as
	select a.*
	from pull_data a inner join reference_data b
    on strip(upcase(a.last_name))=strip(upcase(b.NameLast)) 
      and strip(upcase(a.first_name))=strip(upcase(b.NameFirst))
  ;
quit;

The filter condition you defined does independent lookups in table reference_data meaning the condition gets true if the first and last name in your base table exist anywhere in the lookup table reference_data - it doesn't need to be on the same row.

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 521 views
  • 0 likes
  • 2 in conversation