06-10-2016 12:29 PM
I am trying to identify records that contain a specific word. The word may be misspelled, with 1-2 characters altered. A sample dataset is attached for your kind consideration. I am trying to flag records containing the word "DUODENUM" as well as "DUODENAL" and "DUODENOL". Can somebody help me with this?
Thank you in advance for your kind support.
06-10-2016 12:43 PM
Short of using SOUNDEX (a really horrible approximation of matching words by how they sound), I think this is a good case for the SAS Data Management Advanced suite. The matching capabilities of the data quality products (formerly known as DataFlux) are very advanced and will be your best bet to solve this. And yes, if you have not licensed it yet, you will have to part with a bit of cash. But that *should* be okay if the benefits outweigh the cost.
06-10-2016 03:10 PM
To identify records containing a specific word with 1-2 characters altered, I used the SPEDIS function, but I realized that it compares the whole statement with the word rather than comparing word against word. So I don't know how to apply the matching criteria for an alteration of 1-2 characters to the word itself. SAS code used:
data test1;
    set test;
    if find(report, 'DUODENUM') then value1 = spedis(report, 'DUODENUM');
    if value1 ne .;
run;

proc sort data=test1(obs=1) out=test2;
    by value1;
run;

proc sql;
    create table test3 as
    select id,
           report,
           (select value1 from test2) as value1,
           spedis(report, 'DUODENUM') as value2
    from test
    where (calculated value2 - calculated value1) le 2;
quit;
Thank you in advance for your kind reply.
06-10-2016 11:50 PM
Maybe you could try this:

proc sql;
    create table test3 as
    select value1, value2
    from test as a, test as b
    where value1 =* value2;
quit;
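
If the comparison needs to happen word by word rather than against the whole REPORT string, one possible approach is to split each report into words and measure the edit distance of each word to the target. The sketch below is only an illustration under assumptions: it assumes the dataset TEST has a character variable REPORT (as in the code posted earlier), that words are separated by blanks or common punctuation, and that a Levenshtein distance of 2 or less (via the COMPLEV function) is an acceptable stand-in for "1-2 characters altered". The names FLAGGED, WORD and FLAG are made up for this example.

```sas
/* Sketch: flag a record when any single word is within 2 edits of DUODENUM. */
/* Assumes TEST contains a character variable REPORT; FLAGGED, WORD and FLAG */
/* are hypothetical names used only for this illustration.                   */
data flagged;
    set test;
    length word $200;
    flag = 0;
    /* Walk through the words of the report, stopping at the first match */
    do i = 1 to countw(report, ' ,.;:()') until (flag = 1);
        word = upcase(scan(report, i, ' ,.;:()'));
        /* COMPLEV returns the Levenshtein edit distance between two strings */
        if complev(strip(word), 'DUODENUM') le 2 then flag = 1;
    end;
    drop i word;
run;
```

DUODENAL and DUODENOL each differ from DUODENUM by two substituted characters, so both fall inside the cutoff of 2. SPEDIS could be used per word in the same loop instead of COMPLEV, but its result is a cost scaled by word length rather than a character count, so the threshold would need to be tuned on the data.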
06-16-2016 08:25 AM
Can you kindly guide me further? The code given below did not give me the expected result.
06-16-2016 08:53 PM
Sorry, I can't see any code. Could you post it as a new topic so that more people can see it?