elainathewonder
Calcite | Level 5

Is there a way to group the misspellings of a word together before creating a summary table, so that a search of the data set treats every misspelling as the one correct word? I have all the misspellings listed.

 

Example:

Replacing love, likes, liked, and liker with like.

rajdeep
Pyrite | Level 9

Hi elainathewonder,

 

Greetings of the day.

 

I have put together an example for you. Please check it and let me know if this is what you mean.

 



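data test;
 /* Compile the pattern once (/o): text that starts with L and ends with E */
 patternID=prxparse("/L\w+E/o");
 input address $80. ;
 /* Get the start position and the length of the first match, if any */
 call prxsubstr(patternID, address, position, len);

 /* Replace the matched text, whatever its length, with the common word */
 if position ^= 0 then address= tranwrd(address,substr(address,position,len),'Love');

 drop patternID len;
 datalines;
Zack Johnson, 153 LirsE Str, Chapel Hill, NC27514
Dan Zack, 67891 64th st, Brea, CA
Sally Johns, 4 Moritz LtreE, Duarte, CA 91010
;

run;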

In the example above, the words LirsE and LtreE are both replaced with the common word 'Love'. So if the misspelled words share some kind of similarity, you can describe that similarity as a pattern, match it as in the example, and replace every match with the correct word.
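Since you already have the full list of misspellings, another option is to put the list straight into the pattern instead of looking for a shared shape. Below is only a minimal sketch of that idea: the data set names words and words_clean and the variables word and word_clean are made up for illustration, and the word list is the one from your example. It uses PRXCHANGE to substitute every listed variant with like, then builds the summary table on the cleaned variable.

```sas
/* Hypothetical input: one word per observation */
data words;
   input word :$20.;
   datalines;
like
love
likes
liked
liker
hate
;
run;

data words_clean;
   set words;
   length word_clean $20;
   /* \b limits the match to whole words; the trailing i ignores case. */
   /* -1 tells PRXCHANGE to replace every occurrence in the string.    */
   word_clean = prxchange('s/\b(love|likes|liked|liker)\b/like/i', -1, strip(word));
run;

/* Summary table on the standardized word */
proc freq data=words_clean;
   tables word_clean;
run;
```

If the list is long, keeping the misspellings in a lookup data set or a format may be easier to maintain than one large pattern, but that depends on how your data is laid out.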

 

 

Please check and let me know if there is any disconnect.
