data names;
  length company_name $ 50;
  infile cards dlm='~';
  input company_name $ country $;
  cards;
lenevo pvt ltd~usa
pvt lene~usa
harish industries~india
institute of harish technology~india
bata showroom ltd~usa
multi theature of bata's showroom~india
run;
I have created a data set like the one above. It contains company names, and I want to identify words that sound alike or are spelled alike across records.
For example, the first record contains 'lenevo' and the second contains 'lene', so both should get a new variable with the value 'lene'.
For reference, this is the final output data set I want:
company_name                         country  match_spelling
lenevo pvt ltd                       usa      lene
pvt lene                             usa      lene
harish industries                    india    harish
institute of harish technology       india    harish
bata showroom ltd                    usa      bata
multi theature of bata's showroom    india    bata
You can look at the documentation for the SOUNDEX function.
It creates an "encoded" version of a string that can be compared with the encoded version of another string to see whether the two match.
data example;
  string = 'banana';
  str2 = 'Bannnannna';
  a = soundex(string);
  b = soundex(str2);
  put a= b=;
run;
Read the documentation for a bit about how the algorithm works.
Sounds from other languages are unlikely to be encoded consistently, because the encoding rules are built around the sounds of only one language (English).
Your specific example includes words that do not sound alike, because the number of syllables changes: 'lene' versus 'lenevo'. So you may want more of a "closeness of similar spelling" measure. The functions COMPGED, COMPLEV, and SPEDIS compare the spelling of two strings and score the difference, with smaller scores meaning closer spelling.
data example;
  word1 = 'lene';
  word2 = 'lenevo';
  a = compged(word1, word2);
  b = complev(word1, word2);
  c = spedis(word1, word2);
  put a= b= c=;
run;
You would still need to supply your own rules for "how close is close enough".
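As a rough illustration of such a rule, here is a minimal sketch against the NAMES data set from the question. It splits each company name into words, drops an assumed stop list of generic words, and pairs words from different records whose COMPLEV distance is at most 2. The cutoff of 2, the stop list, the helper names WORDS, MATCHES, REC, and WORD, and the choice to keep the shorter word of each pair as match_spelling are all assumptions for this example, not something SAS provides or the question specifies.

```sas
/* One row per (record, word); REC remembers the source record.        */
data words;
  set names;
  length word $ 50;
  rec = _n_;
  do i = 1 to countw(company_name, ' ');
    word = scan(company_name, i, ' ');
    /* Assumed stop list: generic words that repeat across companies.  */
    if word not in ('pvt', 'ltd', 'of', 'showroom') then output;
  end;
  drop i;
run;

/* Pair words from different records and keep pairs within the cutoff. */
proc sql;
  create table matches as
  select a.rec,
         a.company_name,
         a.country,
         /* Assumed rule: report the shorter word of the pair.         */
         case when length(a.word) <= length(b.word) then a.word
              else b.word
         end as match_spelling length=50
  from words as a, words as b
  where a.rec ne b.rec
    and complev(strip(a.word), strip(b.word)) <= 2   /* assumed cutoff */
  order by a.rec;
quit;
```

A record whose words match several other records would produce several rows here, and a record with no close match would produce none, so real data would likely need a tie-breaking step, a longer stop list, and a cutoff tuned to typical word length.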