sasphd
Lapis Lazuli | Level 10

Hello, 

I want to match names between two databases. The names are not exactly the same, so I want to match them approximately and then check the results by hand.

I ran this program, but it returns no matches:

proc sql ; 
   create table results as
      select  secid, FUND_NAME_MS, FUND_NAME
        from imf_samplenew , Infra_CRSP 
       where imf_samplenew.FUND_NAME_MS eqt Infra_CRSP.FUND_NAME ;
quit ; 

thanks

1 ACCEPTED SOLUTION

ballardw
Super User

Of course not: EQT requires the two values to be equal over the length of the shorter name, and the comparison is case-sensitive. If one name is "ABC Co" and the other is "ABC CO", they do not match.

If the differences are only in case, as in the example above, you could try:

  where upcase(imf_samplenew.FUND_NAME_MS) eqt upcase(Infra_CRSP.FUND_NAME) ;

If you think one name is contained in the other, you might try:

where index(upcase(Longernamevariable),upcase(shorternamevariable))>0
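
Since you may not know in advance which of the two names is the longer one, a sketch that tests containment in both directions might look like the following. The output table name results_contains is made up for illustration. STRIP is added on the assumption that the name variables carry trailing blanks that would otherwise stop the shorter name from being found inside the longer one, and the MISSING checks assume some names may be blank.

```sas
proc sql;
   /* results_contains is a hypothetical table name */
   create table results_contains as
      select secid, FUND_NAME_MS, FUND_NAME
        from imf_samplenew, Infra_CRSP
             /* skip blank names so they cannot produce spurious matches */
       where not missing(imf_samplenew.FUND_NAME_MS)
         and not missing(Infra_CRSP.FUND_NAME)
             /* keep the pair if either name contains the other, ignoring case */
         and (   index(upcase(strip(imf_samplenew.FUND_NAME_MS)),
                       upcase(strip(Infra_CRSP.FUND_NAME))) > 0
              or index(upcase(strip(Infra_CRSP.FUND_NAME)),
                       upcase(strip(imf_samplenew.FUND_NAME_MS))) > 0);
quit;
```

Very short names will be found inside many longer ones, so expect extra candidates to discard in the hand check.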

If the differences are more complex, I might try:

  where compged(imf_samplenew.FUND_NAME_MS,Infra_CRSP.FUND_NAME) < 800 ;

COMPGED is one of the functions that calculate a "spelling distance" (a generalized edit distance) based on internal rules, so minor changes such as "ABC" versus "ABc" produce low values. If you get too many clearly different pairs, reduce the 800; if you don't get matches, increase it.
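
For the hand check, it may help to keep the COMPGED score in the output and sort on it, so the closest candidates come first. This is only a sketch: the table name candidates and the column name ged_score are made up, UPCASE and STRIP are added on the assumption that case and leading or trailing blanks should not count as differences, and 800 is just the starting cutoff from above.

```sas
proc sql;
   /* candidates and ged_score are hypothetical names */
   create table candidates as
      select secid,
             FUND_NAME_MS,
             FUND_NAME,
             /* generalized edit distance; 0 means identical after upcasing */
             compged(upcase(strip(imf_samplenew.FUND_NAME_MS)),
                     upcase(strip(Infra_CRSP.FUND_NAME))) as ged_score
        from imf_samplenew, Infra_CRSP
             /* CALCULATED lets the WHERE clause reuse the computed column */
       where calculated ged_score < 800
       order by secid, ged_score;
quit;
```

Rows with a ged_score of 0 are exact matches after upcasing; the higher the score, the more carefully the pair needs checking. Note that this compares every name in one table with every name in the other, so it can be slow on large tables.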

 


