
Hi there,


I have the claimant's full name in two variables (coming from two sources), each with its SOUNDEX code stored separately:

Mary Ann Gomes   M65252      Ann Mary Gomes   A55622  


I want to remove claimants that don't match based on sounds-like comparison. The claimant above is the same person; only the middle name is placed differently. Assuming I don't split the value into first, middle, and last name, is there a way to treat the two as the same programmatically?


I have tried COMPGED, SPEDIS, and COMPARE, and I am trying to combine them. However, when the name parts are shuffled, as in this case, they are still not very useful.






You can't compare the words in your example without splitting them. Once you split them, of course, you will run into two problems: the two names may have different numbers of words, and you have to identify which word is the surname and which are the first and second names.


The SCAN function lets you pick out the individual words in your example. You then have a match when the third words are equal, the first word of name 1 equals the second word of name 2, and the second word of name 1 equals the first word of name 2.
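The rule above could be sketched in a DATA step roughly as follows. This is only a minimal sketch: the data set and variable names (names, name1, name2, match) are made up for illustration, and it assumes both names contain exactly three blank-separated words, which is the case in your example but will not always hold.

```sas
/* Hypothetical test data: name1 and name2 hold the two full names. */
data names;
  length name1 name2 $40;
  infile datalines dsd;
  input name1 $ name2 $;
  datalines;
Mary Ann Gomes,Ann Mary Gomes
Mary Ann Gomes,Mary Ann Gomez
;
run;

data swap_match;
  set names;
  /* Apply the rule only when both names have exactly three words. */
  if countw(name1, ' ') = 3 and countw(name2, ' ') = 3 then
    match = (    upcase(scan(name1, 3, ' ')) = upcase(scan(name2, 3, ' '))
             and upcase(scan(name1, 1, ' ')) = upcase(scan(name2, 2, ' '))
             and upcase(scan(name1, 2, ' ')) = upcase(scan(name2, 1, ' ')) );
  else match = 0;
run;
```

With this test data the first row is flagged as a match and the second is not. The equality tests are exact; if you want sounds-like behaviour you could compare SOUNDEX(SCAN(...)) values, or use SPEDIS or COMPGED on the individual words instead.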


I suggest you also look at other attributes for matching rather than relying on names alone, because names are subject to spelling and formatting differences. Date of birth and Social Security number are two examples.
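As a rough illustration of combining the name with another attribute, the sketch below builds an order-insensitive key by sorting the words of each name with CALL SORTC, then requires the date of birth to agree as well. Sorting the words is a different technique from the swap rule described above, not something SAS does for you. The variables dob1 and dob2 and the limit of five words per name are assumptions for this example only.

```sas
/* Hypothetical test data: two names plus a date of birth from each source. */
data claims;
  length name1 name2 $40;
  infile datalines dsd;
  input name1 $ name2 $ dob1 :date9. dob2 :date9.;
  format dob1 dob2 date9.;
  datalines;
Mary Ann Gomes,Ann Mary Gomes,12MAR1975,12MAR1975
Mary Ann Gomes,Ann Mary Gomes,12MAR1975,03JUL1982
;
run;

data matched;
  set claims;
  length key1 key2 $200;
  array a[5] $40 a1-a5;   /* words of name1 (assumes at most 5 words) */
  array b[5] $40 b1-b5;   /* words of name2 */
  do i = 1 to 5;
    a[i] = upcase(scan(name1, i, ' '));   /* blank when there is no i-th word */
    b[i] = upcase(scan(name2, i, ' '));
  end;
  call sortc(of a[*]);    /* sort the words so their order no longer matters */
  call sortc(of b[*]);
  key1 = catx(' ', of a[*]);   /* CATX skips the blank entries */
  key2 = catx(' ', of b[*]);
  name_match = (key1 = key2);
  /* Require a non-missing, equal date of birth in addition to the name. */
  match = (name_match and not missing(dob1) and dob1 = dob2);
  drop i a1-a5 b1-b5;
run;
```

Here both rows have name_match = 1, but only the first row has match = 1 because the dates of birth differ in the second.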
