BettyLoo
Fluorite | Level 6

Hello everyone,

 

I'm doing an exercise where I'm trying to create a list of names that "sound alike" or have a different spelling ("Smith" and "Smythe", or "AAron" and "Aron"). I only have one table to look through, and I tried a very simple exercise where I hard-coded a name that the program uses to find names that sound like it (see below):

 

proc sql;
create table soundex as
SELECT EfterNamn
FROM Person
WHERE SOUNDEX('Munson') = SOUNDEX(EfterNamn);
quit;

 

There are two names in that table that sound like Munson (Munksson and Munkson), and therefore a list is created with only those names. However, I want code that is applied to all the names in the table, so that the list contains every group of names that sound alike, whether it's Munson and Munksson or Anderson and Andersson. I've tried to find an example online, but I only seem to find examples that use SOUNDEX across two tables... Has anyone done this before? If so, can anyone help me out a little? 😃

 

Sincerely,

Betty

ChrisHemedinger
Community Manager

You might look at the SPEDIS function, which @Rick_SAS describes in this blog post, Distances between words.  His example uses SAS/IML to create a sort of matrix of distances between words.

 

Also, check out the COMPLEV and COMPGED functions, as described in this blog post from a SAS Tech Support consultant.

 

If you have the data quality software from SAS, you can use Match Codes (DQMATCH) to determine which names are likely the same or similar.

Shmuel
Garnet | Level 18

If you are looking for names that sound like a given name, you can use the following code:

 

%let myname = <any given name>;

proc sql;
   title "Names Sound Like &myname";
   select name
      from table
        where name=*"&myname";
quit;



But if you want to find all groups of names in a table that sound alike, then, assuming the table contains N names, you will need to compare N*(N-1)/2 pairs of names and assign a flag indicating whether or not each pair sounds alike.
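
As a rough sketch of an alternative to flagging every pair, the names can be grouped by their SOUNDEX code so that each group holds the spellings that share a code. This assumes the Person table and EfterNamn column from the original post; the output table name_groups and the column sound_code are names made up for this illustration. It also relies on PROC SQL remerging summary statistics (the default behaviour), so it is a starting point, not a definitive solution.

```sas
/* Sketch: list every name whose SOUNDEX code is shared by at least  */
/* one other distinct spelling in the same table.                    */
/* Person and EfterNamn come from the original post; name_groups and */
/* sound_code are hypothetical names chosen for this example.        */
proc sql;
  create table name_groups as
  select sound_code, EfterNamn
  from (select distinct soundex(EfterNamn) as sound_code, EfterNamn
        from Person)
  group by sound_code
  having count(*) > 1          /* keep codes with 2+ distinct spellings */
  order by sound_code, EfterNamn;
quit;
```

Each value of sound_code in the result identifies one group of names that sound alike, so Munksson and Munkson would appear together if their codes match.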

Cynthia_sas
SAS Super FREQ
And, I found this paper to be a concise description of the different methods:
http://support.sas.com/resources/papers/proceedings12/122-2012.pdf and this is another of my favorites:
http://www.lexjansen.com/nesug/nesug07/ap/ap23.pdf

cynthia
Ksharp
Super User
The sounds-like operator in PROC SQL is =*:

WHERE EfterNamn =*  'Munson'  ;


mkeintz
Jade | Level 19

If you want all unique "sounds like" pairs, join the table to itself on the sounds-like relation, and add a WHERE condition to eliminate duplicate pairs and identical spellings:

 

proc sql;
  select a.name,b.name
  from 
    have as a
  inner join 
    have as b
  on a.name =*  b.name
  where a.name < b.name;
quit;

 

 

 

Of course, the sounds-like relation only allows equality or non-equality. It doesn't allow any notion of "distance" between a pair of names.
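
If a distance is wanted, one possible approach is to keep the same self-join and add the COMPGED and COMPLEV functions mentioned earlier in the thread as extra columns. The sketch below assumes the same have table and name column; the column names name_a, name_b, ged, and lev are invented for the example, and how to interpret or threshold the scores is left open.

```sas
/* Sketch: attach edit-distance scores to each sounds-like pair.   */
/* have and name follow the example above; name_a, name_b, ged and */
/* lev are hypothetical column names.                              */
proc sql;
  select a.name as name_a,
         b.name as name_b,
         compged(strip(a.name), strip(b.name)) as ged, /* generalized edit distance */
         complev(strip(a.name), strip(b.name)) as lev  /* Levenshtein edit distance */
  from have as a
  inner join have as b
    on a.name =* b.name
  where a.name < b.name        /* one row per pair, no identical spellings */
  order by ged;
quit;
```

Smaller scores indicate closer spellings, so the pairs most likely to be the same name sort to the top.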

