
regular expressions (with PERL)


OK all,

I am trying to learn regular expressions and am having some trouble.

I am text mining a note field that can hold thousands of characters. Below is just one of the patterns I am looking for.

I expect it to return a value (any value), but it just returns blanks.

Essentially, I am looking for certain key phrases, such as:

lungs are clear

lungs clear

lungs appear clear

Thanks in advance for your help.

Lawrence

data temp2;
   set notes;
   if _n_=1 then do;
      re= prxparse('/lung\w{0,15}clear/');
      if prxmatch(re,note_text) then ch=1;
   end;
   retain re;
run;


Accepted Solution
07-10-2012 11:58 AM
Re: regular expressions (with PERL)

I believe that the \w metacharacter won't include whitespace, such as blank or tab. Try a period (.) instead, like:

/lung.{0,15}clear/

Also, move your prxmatch call so that it comes after the end; statement. Right now it sits inside the _n_=1 block, so you're only checking the first record.

With {0,15}, I believe the pattern will also find "lungclear", which you might not want. If so, change the quantifier to {1,15}.

You're absolutely on the right track; you'll have it soon!
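Putting the suggestions above together, the corrected step might look something like the sketch below. It assumes the NOTES data set and NOTE_TEXT variable from the original post. Setting ch to 0 instead of leaving it missing on non-matching records is my own choice, not something the original code did.

```sas
/* Sketch only: assumes a data set NOTES with a character variable
   NOTE_TEXT, as in the original post. */
data temp2;
   set notes;
   retain re;

   /* Compile the pattern once, on the first iteration only.
      The period matches any character, including blanks. */
   if _n_ = 1 then re = prxparse('/lung.{1,15}clear/');

   /* Test every record. PRXMATCH returns the position of the
      match, or 0 when the pattern is not found. */
   ch = (prxmatch(re, note_text) > 0);

   drop re;
run;
```

With {1,15}, "lungs clear", "lungs are clear" and "lungs appear clear" each have between 1 and 15 characters between "lung" and "clear", so all three phrases from the question should be flagged, while "lungclear" should not.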



All Replies

Re: regular expressions (with PERL)

\w does not find spaces.  This seems to work but I'm no RegXpert.

data temp2;
   input note_text $50.;
   if _n_=1
      then re= prxparse('/lung\w{0,15}|\s{1,5}clear/');
   if prxmatch(re,note_text) then ch=1;
   retain re;
   cards;
lungs are clear
lungs    clear
lungsclear
lungs appear clear
;;;;
   run;

proc print;
   run;
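One thing worth checking with the pattern above: in Perl regular expressions, alternation (|) has the lowest precedence, so the expression reads as "lung\w{0,15}" OR "\s{1,5}clear". That would flag any note containing "lung", whether or not "clear" follows. The sketch below compares it with a grouped character class, [\w\s], which is one possible alternative and not something proposed in the thread. The data set name CHECK, the variable names, and the test line "lungs congested" are made up for illustration.

```sas
/* Sketch only: compares the alternation pattern with a grouped
   character class. Names and the second test line are hypothetical. */
data check;
   input note_text $50.;
   retain re_alt re_grp;

   if _n_ = 1 then do;
      /* Parsed as: (lung\w{0,15}) OR (\s{1,5}clear) */
      re_alt = prxparse('/lung\w{0,15}|\s{1,5}clear/');
      /* Word characters or whitespace between "lung" and "clear" */
      re_grp = prxparse('/lung[\w\s]{0,15}clear/');
   end;

   ch_alt = (prxmatch(re_alt, note_text) > 0);
   ch_grp = (prxmatch(re_grp, note_text) > 0);

   drop re_alt re_grp;
   cards;
lungs are clear
lungs congested
;;;;
run;

proc print data=check;
run;
```

If the precedence reading is right, "lungs congested" should come back with ch_alt=1 but ch_grp=0, while "lungs are clear" should be flagged by both.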



Re: regular expressions (with PERL)

Thanks both of you!

Both solutions help, although TomKari's answer works a tiny bit better.

Another question for you, TomKari:

How can I make the first letter of "lung" match in either upper or lower case?

Lawrence
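The thread was locked before this follow-up was answered. For what it's worth, here is a sketch of two common ways to handle it, again assuming the NOTES data set and NOTE_TEXT variable from the original post. A character class, [Ll], accepts only the first letter in either case. The standard Perl /i modifier, which PRXPARSE accepts, makes the whole pattern case-insensitive. The output names TEMP3, RE_FIRST, RE_ANY, CH_FIRST and CH_ANY are hypothetical.

```sas
/* Sketch only: two ways to relax case sensitivity.
   Assumes NOTES and NOTE_TEXT as in the original post. */
data temp3;
   set notes;
   retain re_first re_any;

   if _n_ = 1 then do;
      /* Only the first letter may be upper or lower case. */
      re_first = prxparse('/[Ll]ung.{1,15}clear/');
      /* The /i modifier ignores case across the whole pattern. */
      re_any   = prxparse('/lung.{1,15}clear/i');
   end;

   ch_first = (prxmatch(re_first, note_text) > 0);
   ch_any   = (prxmatch(re_any,   note_text) > 0);

   drop re_first re_any;
run;
```

The /i version would also catch notes typed in all capitals, such as "LUNGS ARE CLEAR", which may or may not be what is wanted.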
