Perl regular expressions

Super User
Posts: 7,720

Hi,

I am struggling to get something basic working and can't see why it fails. Given the string:

{\rtf1\ansi\ansicpg1252\uc1\deff0\deflang1033\deflangfe1033

Why does the following code not give a position at all (raw_base contains the above text)?

data want;
  set have;
  tmp=prxparse("/rtf/");
  tmp2=prxposn(tmp,1,raw_base);
run;

I also tried ("/\\rtf/") and ("/ansi/"); neither seems to work.


Accepted Solutions
Solution
‎07-31-2014 08:39 AM
Super User
Posts: 9,875

Re: Perl regular expressions

Try this one:

data raw_base;
  raw_base='{\rtf1\ansi\ansicpg1252\uc1\deff0\deflang1033\deflangfe1033';
run;

data want;
  set raw_base;
  if _n_=1 then do;tmp=prxparse('/(rtf)/');end;
  if prxmatch(tmp,raw_base) then do;
    tmp2=prxposn(tmp,1,raw_base);
  end;
run;

Xia Keshan



All Replies
Super User
Posts: 9,875

Re: Perl regular expressions

You missed prxmatch(). Check the documentation; it contains an example.

Valued Guide
Posts: 3,208

Re: Perl regular expressions

As Ksharp already noted, prxmatch must be executed before prxposn.

The prxparse call is only needed once; it appears to compile the expression, after which the result can be reused.

prxposn uses grouping results, hence the ( ) in the pattern. See:

SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (prx) and SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (the prxposn function, not the call)

Try:

data want;
  set have;
  if ( _n_ =1 ) then do; tmp=prxparse('/(rtf)/'); end;
  tmp1=prxmatch(tmp,raw_base);
  tmp2=prxposn(tmp,1,raw_base);
  put "output: " tmp1 tmp2 $8. ;
run;

output: 2 rtf    

---->-- ja karman --<-----
Super User
Posts: 7,720

Re: Perl regular expressions

Thanks Jaap, but I couldn't get your example working. I am using version 9.3 (not sure if that makes a difference). I copied and pasted it and first got a set of errors about tmp not being populated on each row, so I modified it as follows:

data want;
  if (_n_=1) then do;tmp=prxparse('/rtf/');end;
  set raw_base;
  retain tmp;
  if prxmatch(tmp,raw_base) then do;
    tmp1=prxmatch(tmp,raw_base);
    tmp2=prxposn(tmp,1,raw_base);
  end;
run;

With the above, tmp1 is correct: only the first record has a non-null tmp1, and it is 3, where my string starts. However, tmp2 is always empty.

Occasional Contributor
Posts: 17

Re: Perl regular expressions

prxposn() returns the contents of a capture buffer, which has to be defined using parentheses (). Try this:

data want;
  if (_n_=1) then do;tmp=prxparse('/(rtf)/');end;
  var='{\rtf1\ansi\ansicpg1252\uc1\deff0\deflang1033\deflangfe1033';
  retain tmp;
  if prxmatch(tmp,var) then do;
    tmp1=prxmatch(tmp,var);
    tmp2=prxposn(tmp,1,var);
  end;
run;

Haikuo

Super User
Posts: 7,720

Re: Perl regular expressions

Thank you HaiKuo and Gergely Batho. Unfortunately I have run out of "helpful answer" marks to award, but your input was very helpful.

SAS Employee
Posts: 340

Re: Perl regular expressions

Jaap's code is correct. Of course you need the retain statement.
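
To illustrate why the retain matters, here is a minimal sketch. The data set have and the variables re, pos, and found are made up for this example. A variable assigned in the data step is reset to missing at the start of every iteration, so a pattern id that is set only when _n_=1 is gone from the second row onward unless it is retained.

```sas
/* Hypothetical three-row input, only for this illustration. */
data have;
  length raw_base $ 80;
  raw_base = '{\rtf1\ansi'; output;
  raw_base = 'plain text';  output;
  raw_base = '\rtf again';  output;
run;

data want;
  set have;
  /* Keep the pattern id across iterations; without this line   */
  /* re is missing on every row after the first.                */
  retain re;
  if _n_ = 1 then re = prxparse('/(rtf)/');

  length found $ 8;
  /* PRXMATCH returns the start position of the match, or 0. */
  pos = prxmatch(re, raw_base);
  /* Read capture buffer 1 only after a successful match. */
  if pos > 0 then found = prxposn(re, 1, raw_base);
run;
```

With the retain in place, every row is tested against the same compiled pattern; rows without a match simply get pos=0 and a blank found.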

On the other hand, you forgot the parentheses: prxparse('/rtf/') should be prxparse('/(rtf)/').

Without the parentheses there is no capture buffer, and prxposn() returns the content of the capture buffer(s).

But I don't think you need it, since you already know what the capture buffer will contain: rtf (or missing).

So I suppose this is just an example. Your search string is more complex in your real problem.

Probably:  prxparse('/\\(rtf.*?)\\/')

This will match this string: \rtf1\

Position will be: 2

Capture buffer content will be: rtf1
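
A small self-contained sketch of that pattern, run against the sample string from the question. The variable names re, pos, and word are arbitrary choices for this illustration.

```sas
data _null_;
  length word $ 32;
  /* The sample string from the question. */
  raw_base = '{\rtf1\ansi\ansicpg1252\uc1\deff0\deflang1033\deflangfe1033';

  /* Two backslashes match one literal backslash;            */
  /* (rtf.*?) is capture buffer 1, and the ? keeps it short. */
  re = prxparse('/\\(rtf.*?)\\/');

  /* PRXMATCH has to run before PRXPOSN can return anything. */
  pos = prxmatch(re, raw_base);

  if pos > 0 then do;
    word = prxposn(re, 1, raw_base);
    put pos= word=;
  end;
run;
```

The log line should read pos=2 word=rtf1, in line with the position and capture buffer content described above.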


Valued Guide
Posts: 3,208

Re: Perl regular expressions

Yes, I forgot the retain statement needed for running over all observations. Thanks rw9 / batho.

I described prxparse as a compilation step but did not act on that (sorry).
It was tested using UE (9.4), but the approach is the same as in 9.1.3. The grouping ( ) in the prx part is what caught my attention.

I did mention the difference between prxposn as a function and as a call routine.

The first returns the string; the latter returns the position (a number).
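
A minimal sketch of the two forms side by side, assuming the sample string from the question. The names re, buf, start, and len are only illustrative.

```sas
data _null_;
  length buf $ 32;
  raw_base = '{\rtf1\ansi\ansicpg1252\uc1\deff0\deflang1033\deflangfe1033';
  re = prxparse('/(rtf)/');

  if prxmatch(re, raw_base) then do;
    /* Function form: returns the text of capture buffer 1. */
    buf = prxposn(re, 1, raw_base);

    /* CALL routine form: fills in the start position and */
    /* the length of capture buffer 1 instead.            */
    call prxposn(re, 1, start, len);

    put buf= start= len=;
  end;
run;
```

For this string the log should show buf=rtf start=3 len=3, since rtf begins at the third character.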

---->-- ja karman --<-----
🔒 This topic is solved and locked.
