Text mining and content categorization

How to do a fuzzy comparison of texts?

Occasional Contributor
Posts: 11

Hello,

We want to find persons who appear more than once in our dataset.

We have the names and addresses of these persons.

The problem is that the names of persons and streets can be spelled in different ways, e.g. with or without "-", or with "n" versus "nn".

Is there a way in SAS to do a "fuzzy" comparison of alphanumeric variables?

Something like checking whether "most" (90%, 95%) of two strings is identical?

Thanks!

badikidiki 


Accepted Solutions
Solution
‎06-11-2015 07:20 AM
Super Contributor
Posts: 334

Re: How to do a fuzzy comparison of texts?

You could look here: http://blogs.sas.com/content/sgf/2015/01/27/how-to-perform-a-fuzzy-match-using-sas-functions/

Or, if a simple approach is enough, you could use the COMPRESS function to remove dashes and similar characters, and then compare the cleaned values.
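As a rough sketch of both ideas in one DATA step: the dataset PAIRS, its sample names, and the 0.90 cutoff are made up for illustration. COMPLEV (Levenshtein edit distance) is only one possible choice and not necessarily what the linked blog post uses; COMPGED and SPEDIS are alternatives worth trying on your own data.

```sas
/* Hypothetical sample data: names and pairs are invented for illustration. */
data pairs;
  length name1 name2 $40;
  infile datalines dsd;          /* comma-delimited, so blanks inside names are kept */
  input name1 $ name2 $;
  datalines;
Hans-Peter Hoffmann,Hans Peter Hofmann
Anna Schmidt,Anna Schmitt
Karl Maier,Julia Becker
;
run;

data scored;
  set pairs;
  length clean1 clean2 $40;

  /* Simple route: remove dashes and blanks, ignore case, then compare exactly. */
  clean1 = upcase(compress(name1, '- '));
  clean2 = upcase(compress(name2, '- '));
  exact_after_cleanup = (clean1 = clean2);

  /* Fuzzy route: edit distance scaled by the length of the longer string, */
  /* giving a rough "share of the string that is identical" between 0 and 1. */
  lev        = complev(clean1, clean2);
  similarity = 1 - lev / max(length(clean1), length(clean2));

  /* Assumed threshold; tune it by reviewing real matches and non-matches. */
  likely_match = (similarity >= 0.90);
run;

proc print data=scored;
  var name1 name2 exact_after_cleanup lev similarity likely_match;
run;
```

To search a whole table for duplicates you would first have to build the candidate pairs, for example with a PROC SQL self-join restricted to records that share a postal code, since comparing every record with every other one gets expensive quickly.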



All Replies
Occasional Contributor
Posts: 11

Re: How to do a fuzzy comparison of texts?

Thank you very much for your helpful answer!

☑ This topic is SOLVED.
