Desktop productivity for business analysts and programmers

regexp_substr pick out gibberish names in text field

Reply
Occasional Contributor
Posts: 9

regexp_substr pick out gibberish names in text field

I am trying to pick out gibberish names (typically a string of consonants) out of a name field I will call TEXT3. I was passed the REGEXP_SUBSTR function by a colleague and got the function to work on another data set. The field I am applying to in my data set is a text field char $40. For some reason I can't get it to work:

 

 

proc sql;

create table check_susp_names as

select TEXT3, case

 

when REGEXP_SUBSTR(trim(TEXT3),'(m)c{1,2}+',1,1,'i') is not null then 0

when REGEXP_SUBSTR(trim(TEXT3),'[bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ]{4}+',1,1,'i') is not null then 1

when REGEXP_SUBSTR(trim(TEXT3),'(q)[^u]+',1,1,'i') <>'' then 1

else 0

end as Susp_Ind

from dataset_a;quit;

 

 

 

the error I get is:

ERROR: Function REGEXP_SUBSTR could not be located.

 

Trusted Advisor
Posts: 1,114

Re: regexp_substr pick out gibberish names in text field

Hello @agrocrag14,

 

REGEXP_SUBSTR is an Oracle function, as is explained in this paper. So, you can use it in queries against an Oracle database using the PROC SQL Pass-Through facility, but it is not applicable to SAS datasets. Use PRX functions instead (see references in the paper or the documentation).

 

Ask a Question
Discussion stats
  • 1 reply
  • 264 views
  • 0 likes
  • 2 in conversation