Hello everyone,
I'm doing an exercise where I'm trying to create a list of names that "sound alike" but are spelled differently ("Smith" and "Smythe", or "Aaron" and "Aron"). I only have one table to look through, and I tried a very simple exercise where I hard-coded a name for the program to use when looking for names that sound like it (see below):
There are two names in that table that sound like Munson (Munksson and Munkson), and therefore a list is created with only those names. However, I want code that is applied to all the names in the table, so that the list contains every set of similar-sounding names, whether it's Munson and Munksson or Anderson and Andersson. I've tried to find an example online, but I only seem to find examples that use SOUNDEX on two tables... Has anyone done this before? If so, can anyone help me out a little? 😃
Sincerely,
Betty
You might look at the SPEDIS function, which @Rick_SAS describes in this blog post, Distances between words. His example uses SAS/IML to create a sort of matrix of distances between words.
Also, check out the COMPLEV and COMPGED functions, as described in this blog post from a SAS Tech Support consultant.
If you have the data quality software from SAS, you can use Match Codes (DQMATCH) to determine which names are likely the same or similar.
If you are looking for names that sound like a given name, you can use the following code:
%let myname = <any given name>;
proc sql;
  title "Names Sound Like &myname";
  select name
    from table
    where name =* "&myname";
quit;
But if you want to find all groups of names in a table that sound alike,
then, assuming the table contains N names, you will need to compare N*(N-1)/2 pairs of names
and assign a flag indicating whether each pair sounds alike or not.
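One way to avoid comparing every pair explicitly is to compute a sound code once per name and then keep only the codes shared by more than one name. The sketch below is only an illustration: the table name have, the column name, and the sample rows are assumptions, and it uses the SOUNDEX function directly, which may not agree with the =* operator in every case.

```sas
/* Hypothetical sample data; table and column names are assumptions. */
data have;
  input name :$20.;
  datalines;
Munson
Munksson
Munkson
Anderson
Andersson
Smith
;
run;

/* Compute one SOUNDEX code per name. */
data coded;
  set have;
  length sx $20;
  sx = soundex(name);
run;

/* Sort so that names sharing a code are adjacent. */
proc sort data=coded;
  by sx name;
run;

/* Keep only codes that occur for more than one name. */
data groups;
  set coded;
  by sx;
  if not (first.sx and last.sx);
run;

proc print data=groups;
run;
```

Each value of sx in the groups table then identifies one set of names that share a code, so no name has to be hard-coded.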
The sounds-like operator in SQL is =*, for example: WHERE EfterNamn =* 'Munson';
If you want all unique "sounds like" pairs, join the table to itself on the sounds-like relation, and add a WHERE condition to eliminate duplicate pairs and identical spellings:
proc sql;
  select a.name, b.name
    from have as a
    inner join have as b
      on a.name =* b.name
    where a.name < b.name;
quit;
Of course, the sounds-like relation only allows equality or non-equality. It doesn't allow any notion of "distance" between a pair of names.
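If a distance is wanted, the same self-join can be written with one of the edit-distance functions mentioned earlier in the thread. This is a rough sketch, not a recommendation: the table have, the column name, the sample rows, and the cutoff of 2 are all assumptions chosen for illustration, and COMPLEV measures spelling differences rather than sound.

```sas
/* Hypothetical sample data; table and column names are assumptions. */
data have;
  input name :$20.;
  datalines;
Munson
Munksson
Munkson
Anderson
Andersson
;
run;

proc sql;
  /* Levenshtein edit distance for every unordered pair of names. */
  select a.name as name1,
         b.name as name2,
         complev(a.name, b.name) as lev_dist
    from have as a, have as b
    where a.name < b.name             /* each pair once, no self-pairs */
      and calculated lev_dist <= 2    /* assumed cutoff; adjust to taste */
    order by lev_dist;
quit;
```

A smaller lev_dist means the two spellings are closer, so the cutoff controls how loosely names are matched. COMPGED or SPEDIS could be substituted in the same place, though their scales differ.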