BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasphd
Lapis Lazuli | Level 10

Hello, 

I want to match names between tow database. the names are not exactly the same. I want to apprximate and after that make a hand check 

I run this program but it gives no match

proc sql ; 
   create table results as
      select  secid, FUND_NAME_MS, FUND_NAME
        from imf_samplenew , Infra_CRSP 
       where imf_samplenew.FUND_NAME_MS eqt Infra_CRSP.FUND_NAME ;
quit ; 

thanks

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@sasphd wrote:

Hello, 

I want to match names between tow database. the names are not exactly the same. I want to apprximate and after that make a hand check 

I run this program but it gives no match

proc sql ; 
   create table results as
      select  secid, FUND_NAME_MS, FUND_NAME
        from imf_samplenew , Infra_CRSP 
       where imf_samplenew.FUND_NAME_MS eqt Infra_CRSP.FUND_NAME ;
quit ; 

thanks


Of course not, you are requiring some form of equality for the length of the shortest name. If one name is "ABC Co" and the other is "ABC CO" they do not match.

If your differences are of case, like the above then you could try:

  where upcase(imf_samplenew.FUND_NAME_MS) eqt upcase(Infra_CRSP.FUND_NAME) ;

If you think the differences are one name is part of the other you might try

where index(upcase(Longernamevariable),upcase(shorternamevariable))>0

If the differences are more complex I might try

  where compged(imf_samplenew.FUND_NAME_MS,Infra_CRSP.FUND_NAME) < 800 ;

COMPGED is one of the functions that will calculate a "spelling distance" based on some internal rules so minor changes "ABC" and "ABc" have low values .If you have too many really different results reduce the 800, if you don't get matches increase it.

 

View solution in original post

1 REPLY 1
ballardw
Super User

@sasphd wrote:

Hello, 

I want to match names between tow database. the names are not exactly the same. I want to apprximate and after that make a hand check 

I run this program but it gives no match

proc sql ; 
   create table results as
      select  secid, FUND_NAME_MS, FUND_NAME
        from imf_samplenew , Infra_CRSP 
       where imf_samplenew.FUND_NAME_MS eqt Infra_CRSP.FUND_NAME ;
quit ; 

thanks


Of course not, you are requiring some form of equality for the length of the shortest name. If one name is "ABC Co" and the other is "ABC CO" they do not match.

If your differences are of case, like the above then you could try:

  where upcase(imf_samplenew.FUND_NAME_MS) eqt upcase(Infra_CRSP.FUND_NAME) ;

If you think the differences are one name is part of the other you might try

where index(upcase(Longernamevariable),upcase(shorternamevariable))>0

If the differences are more complex I might try

  where compged(imf_samplenew.FUND_NAME_MS,Infra_CRSP.FUND_NAME) < 800 ;

COMPGED is one of the functions that will calculate a "spelling distance" based on some internal rules so minor changes "ABC" and "ABc" have low values .If you have too many really different results reduce the 800, if you don't get matches increase it.

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 563 views
  • 1 like
  • 2 in conversation