chemicalab
Fluorite | Level 6

Hi all,

I'm fairly new to SAS, so I would gladly take your advice. My issue is as follows: I don't have Text Miner, and I am trying to handle a text-matching problem in Base SAS.

What I want to do is recognize whether the names match within a group of rows (say 5, 7, or 3 rows). For example, if I have Alex Smith in row 1 and Smith Alex in row 2, the program should figure out that it is the same name.

In addition, there can be rows where the name is Smith Alexander or Smith Alex, which is the same name, and I would like SAS to recognize that too.

That means that if two rows have at least 2 words in common (out of, let's say, 3 words in each row), I would like to find a command so that SAS considers them the same and places them in the same group.

I hope that makes sense, and I hope some advice can be found here.

Thanks in advance

1 REPLY
art297
Opal | Level 21

You can find quite a bit on the web if you search for "fuzzy match".  A couple of nice examples, with complete code, can be found at:

http://www.sconsig.com/sastips/tip00000.htm

http://www.sconsig.com/sastips/tip00392.htm

SAS has a number of functions and call routines that help with similarity checks (e.g., COMPLEV, COMPGED, COMPARE, CALL COMPCOST, SOUNDEX, SPEDIS, and the PRX regular-expression functions).  Look at all of them to see which might work best for you.
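
As a rough illustration of how a few of these functions behave, here is a minimal sketch that scores some name pairs. The data set names (pairs, scores), the variable names, and the sample names are made up for the example, and no cutoff is suggested; you would have to pick thresholds by looking at your own data.

```sas
/* Hypothetical sample data: each row holds two names to compare. */
data pairs;
   infile datalines dlm=',' truncover;
   length name1 name2 $40;
   input name1 $ name2 $;
   datalines;
Smith Alex,Smith Alexander
Alex Smith,Smith Alex
Smith Alex,Jones Maria
;
run;

data scores;
   set pairs;
   /* Lower values mean "more similar" for the three distance functions. */
   lev     = complev(strip(name1), strip(name2));   /* Levenshtein edit distance   */
   ged     = compged(strip(name1), strip(name2));   /* generalized edit distance   */
   sped    = spedis(strip(name1), strip(name2));    /* spelling distance (asymmetric) */
   /* 1 if both names produce the same Soundex code, 0 otherwise. */
   same_sx = (soundex(name1) = soundex(name2));
run;

proc print data=scores noobs;
run;
```

Keep in mind that edit distances are sensitive to word order, so a swapped pair like Alex Smith / Smith Alex can look far apart even though it is the same person. For that case, comparing the names word by word is probably a better fit.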

Of course, if they are available to you, Text Miner and DataFlux could save you a lot of development cost and effort.
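
If you stay in Base SAS, the rule described in the question (two rows count as the same when they share at least two words) could be sketched roughly as below. This is only one possible approach, not a tested solution: the data set names (names, pairs, matches), the variables, and the sample rows are assumptions for the example, and it only counts exact whole-word matches, ignoring case.

```sas
/* Hypothetical sample data: one name per row, with an id. */
data names;
   infile datalines dlm=',' truncover;
   length name $40;
   input id name $;
   datalines;
1,Alex Smith
2,Smith Alex
3,Smith Alexander
4,Maria Jones
;
run;

/* Build every pair of distinct rows once (id1 < id2). */
proc sql;
   create table pairs as
   select a.id as id1, a.name as name1,
          b.id as id2, b.name as name2
   from names as a, names as b
   where a.id < b.id;
quit;

/* Count the words of name1 that also appear as whole words in name2. */
data matches;
   set pairs;
   length word $40;
   common = 0;
   do i = 1 to countw(name1, ' ');
      word = scan(name1, i, ' ');
      /* 'i' modifier: ignore case; blank is the only word delimiter */
      if findw(name2, strip(word), ' ', 'i') > 0 then common = common + 1;
   end;
   same_person = (common >= 2);   /* 1 when at least two words are shared */
   drop i word;
run;
```

Because only whole words are compared, Alex and Alexander do not count as a shared word here. To treat those as equal you would need an extra step, for example scoring the individual words with COMPLEV or SPEDIS instead of testing for an exact match. Also note that pairing every row with every other row grows quickly with the number of rows, so on a large table you would want to restrict the pairs to rows within the same group first.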
