Identifying records having specific word with alteration in 1-2 characters spelling

Frequent Contributor
Posts: 96

Hi there,

I am trying to identify records that contain a specific word. The word may be misspelled, with 1-2 characters altered. A sample dataset is attached for your kind consideration. I am trying to flag records containing the word "DUODENUM" as well as "DUODENAL" and "DUODENOL". Can somebody help me with this?

Thank you in advance for your kind support.

Regards,

Deepak Swain
Super Contributor
Posts: 408

Re: Identifying records having specific word with alteration in 1-2 characters spelling

Hi,

Short of using SOUNDEX (a really horrible approximation of matching words on how they sound), I think this is a good case for the SAS Data Management Advanced suite. The matching capabilities of the data quality products (formerly known as DataFlux) are very advanced and will be your best bet to solve this. And yes, if you have not licensed it yet, you will have to part with a bit of cash. But that *should* be OK if the benefits outweigh the cost.

Regards,

- Jan.

Frequent Contributor
Posts: 96

Re: Identifying records having specific word with alteration in 1-2 characters spelling

Hi there,

To identify records containing a specific word with 1-2 characters altered, I used the SPEDIS function, but I realized that it compares the whole statement with the word rather than comparing each word on its own. So I don't know how to apply the matching criterion of a 1-2 character alteration to the word itself. SAS code used:

data test1;
  set test;
  if find(report, 'DUODENUM') then value1 = spedis(report, 'DUODENUM');
  if value1 ne .;
run;

proc sort data=test1(obs=1) out=test2;
  by value1;
run;

proc sql;
  create table test3 as
  select id, report, (select value1 from test2) as value1, spedis(report, 'DUODENUM') as value2
  from test
  where (calculated value2 - calculated value1) le 2;
quit;

Thank you in advance for your kind reply.

Regards,

Deepak Swain
Super User
Posts: 9,681

Re: Identifying records having specific word with alteration in 1-2 characters spelling

Maybe you could try this:


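proc sql;
  /* =* is the PROC SQL "sounds-like" operator.                     */
  /* value1 and value2 are assumed here to stand for the two columns */
  /* being compared; they are not defined in this snippet.           */
  create table test3 as
  select value1, value2
  from test as a, test as b
  where value1 =* value2;
quit;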
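
If the aim is to compare each word of the report with the target, rather than the whole string, a minimal sketch might split the text with COUNTW and SCAN and measure the edit distance of each word with COMPLEV. The sample rows, the delimiter list, the names flagged, flag and word, and the cutoff of 2 are assumptions for illustration only; the attached dataset is not shown in this thread.

```sas
/* Hypothetical sample rows; the real attached dataset is not shown */
data test;
  infile datalines truncover;
  input id 1 report $ 3-60;
  datalines;
1 ULCER SEEN IN THE DUODENUM
2 DUODENAL BIOPSY TAKEN
3 NORMAL DUODENOL MUCOSA
4 STOMACH APPEARS NORMAL
;
run;

data flagged;
  set test;
  length word $40;
  flag = 0;
  /* Walk through the report one word at a time (delimiter list is an assumption) */
  do i = 1 to countw(report, ' ,.;:()');
    word = upcase(scan(report, i, ' ,.;:()'));
    /* COMPLEV returns the Levenshtein edit distance between the two strings; */
    /* a distance of 2 or less is treated as a match (assumed cutoff)          */
    if complev(strip(word), 'DUODENUM') le 2 then do;
      flag = 1;
      leave;
    end;
  end;
  drop i word;
run;
```

With these made-up rows, ids 1 to 3 get flag=1 (DUODENAL and DUODENOL are each two substitutions away from DUODENUM) and id 4 gets flag=0. SPEDIS(word, 'DUODENUM') could be used in the same loop instead, but its cost scale is different, so the cutoff would need tuning.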

Frequent Contributor
Posts: 96

Re: Identifying records having specific word with alteration in 1-2 characters spelling

Hi Ksharp,

Could you kindly guide me further? The code given below did not give me the expected result.

Regards,

Deepak Swain
Super User
Posts: 9,681

Re: Identifying records having specific word with alteration in 1-2 characters spelling

Sorry, I can't see any code. Could you post it as a new topic so that more people can see it?
