Name Matching with SAS

Contributor
Posts: 49

I am looking into merging two databases based on people's names. I'm not looking for exact matches, since the two databases come from different organizations and one is a little older. One database, the master file, has ~4.5 million entries; the other is significantly smaller. I've looked at a few techniques, one being fuzzy matching with COMPGED. Unfortunately, the paper I was originally looking at was taken down. I have since been trying the code below, which I got from a SAS blog (http://blogs.sas.com/content/sgf/2015/01/27/how-to-perform-a-fuzzy-match-using-sas-functions/). I was originally also using the SOUNDEX portion of the code, but for some reason I was getting a ridiculous number of matches. I also saw a technique where a PROC SQL join builds every possible combination of names from the two tables. That probably isn't feasible for me because of the size of my databases.

So my question is what is the best technique for name matching with large databases? A link to a tutorial would be very much appreciated.

proc sql;
    create table names.mastermatch_nppes as
        select A.fullname as namemaster, B.full_name as nameMO
        from names.master_nppes2 as A,
            names.missouri as B
        where A.NPI=B.NPI
;
quit;
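
For readers following along, here is a rough sketch of how a COMPGED comparison could be combined with a cheap blocking rule, so that the join does not have to score every possible pair. The output table work.fuzzy_candidates, the first-letter blocking rule, and the cutoff of 200 are all assumptions made for illustration; none of them comes from the blog post, and the cutoff in particular would need tuning against real data.

```sas
/* Sketch only: table work.fuzzy_candidates, the first-letter blocking  */
/* rule, and the cutoff of 200 are illustrative assumptions.            */
proc sql;
    create table work.fuzzy_candidates as
        select A.fullname as namemaster,
               B.full_name as nameMO,
               /* generalized edit distance; lower means more similar */
               compged(upcase(A.fullname), upcase(B.full_name)) as ged_score
        from names.master_nppes2 as A,
             names.missouri as B
        /* blocking: only score pairs that share the same first letter */
        where substr(upcase(A.fullname), 1, 1) = substr(upcase(B.full_name), 1, 1)
          and calculated ged_score <= 200
        order by nameMO, ged_score;
quit;
```

Blocking on the first letter misses pairs whose first character differs (for example, a typo in the first position), so it trades some recall for a much smaller set of comparisons. A different blocking key, such as state or ZIP code if both tables have one, may suit the data better.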

Respected Advisor
Posts: 3,156

Re: Name Matching with SAS

Super User
Posts: 19,008

Re: Name Matching with SAS

I like Fried Egg's solution here as well:

New Contributor
Posts: 3

Re: Name Matching with SAS

If you need this on a regular basis, you should take a closer look at the SAS/DataFlux offerings for Data Quality. In particular, they already include name standardization and matching through the SAS Quality Knowledge Base (QKB) for Contact Information (QKB CI), which can save a lot of time and programming effort.

Depending on your installation, you can access them from your SAS session via PROC DQSCHEME; see SAS(R) 9.4 Data Quality Server: Reference.
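
As a hedged illustration of the QKB route, the sketch below uses the DQMATCH function rather than PROC DQSCHEME, because match codes allow an ordinary equality join instead of pairwise scoring. It assumes SAS Data Quality Server is licensed, that a QKB with the ENUSA locale and its 'Name' match definition is installed, and that the DQSETUPLOC path (a placeholder here) is replaced with the real QKB location. The work tables and the sensitivity of 85 are likewise illustrative.

```sas
/* Assumptions: SAS Data Quality Server is licensed, the ENUSA locale   */
/* is installed, and the DQSETUPLOC path below is a placeholder.        */
%dqload(dqlocale=(ENUSA), dqsetuploc='/path/to/your/qkb');

/* Generate a match code for each name in the master file */
data work.master_mc;
    set names.master_nppes2;
    length name_mc $ 255;
    name_mc = dqMatch(fullname, 'Name', 85, 'ENUSA');
run;

/* Same match definition and sensitivity for the second file */
data work.missouri_mc;
    set names.missouri;
    length name_mc $ 255;
    name_mc = dqMatch(full_name, 'Name', 85, 'ENUSA');
run;

/* Names that produce the same match code are treated as candidates */
proc sql;
    create table work.mc_matches as
        select A.fullname as namemaster, B.full_name as nameMO
        from work.master_mc as A
             inner join work.missouri_mc as B
             on A.name_mc = B.name_mc
        where A.name_mc is not missing;
quit;
```

Lower sensitivity values make the match codes more tolerant, so more names collapse to the same code; the right value depends on how many false matches are acceptable.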

Contributor
Posts: 49

Re: Name Matching with SAS

Thanks, I'll take a look at that. I'm not sure how often name matching will come up in the project, but it seems worth checking out.

Contributor
Posts: 49

Re: Name Matching with SAS

Thanks for the suggestions. I'll try them out with my data and see how it goes.

Contributor
Posts: 49

Re: Name Matching with SAS

Also, if it makes a difference, quite a few of the names will not be English.
