I have two data sets as following below.
Data set 1 is a list of company names.
Name
Apple
Amazon
Data set 2 has the following variables.
ID Year1 USname nameO Country Year2
01 1992 Apple Apple | Ford US 1993
02 1993 Alphabet Toyato | Amazon US 1995
03 1994 Amazon.com Amazon US 1996
04 1995 Alphabet Google US 1997
05 1996 IBM IBM US 1998
Name in data set 1 is the same with nameO in data set 2 (Google) or Name in data set1 is partly same with nameO in data set 2 (Apple).
What I want is to match these two data sets as long as NameO is partly same with Name.
Expected
ID Year1 USname nameO Country Year2 name
01 1992 Apple Apple | Ford US 1993 Apple
02 1993 Alphabet Toyato | Amazon US 1995 Amazon
03 1994 Amazon.com Amazon US 1996 Amazon
04 1995 Alphabet Google US 1997 Google
@Ksharp offered the following codes to me. It is helpful.
proc sql;
select *
from b inner join a
on b.nameO contains strip(a.name);
quit;
However, I found another problem.
I found that Names using the upper case such as "APPLE" in data A cannot be matched with NameO using lower cases in data b such as "apple" or "Apple".
Is there any way to ignore the lower case or upper case? Thanks.
proc sql;
select *
from b inner join a
on upper(b.nameO) contains strip(upper(a.name));
quit;
proc sql;
select *
from b inner join a
on upper(b.nameO) contains strip(upper(a.name));
quit;
so this code changes both names as upper cases. Is it correct? Thanks.
it doesnt change the case in the output , but it does convert both to uppercase when it is joining them together
Thank you so much!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.