Re: reg ex extract

Steelers_In_DC · Posted 10-15-2021 09:52 AM

I have a long alphanumeric string with forward-slash delimiters throughout. Within the string are specific 'case' ID numbers. I am trying to get everything in the string after the word 'Case'; after that I can use the delimiter to drop the rest. I'm so out of practice with regex that I haven't been able to get there by searching.

Any help is appreciated.

Steelers_In_DC · Posted 10-15-2021 10:06 AM

I'm always open to suggestions but in case this can help anyone else I'm going to put my solution below:

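/*identifies word case in string*/

data p_case;
set check_case(obs=10 keep=Event_Log_Derived_Details);
/*compile the pattern once, on the first row, and keep the id across rows*/
retain p_case;
if _N_ = 1 then do;
p_case = prxparse("/Case/");
end;
/*position of the first 'Case' in the string, 0 if absent*/
find_case = prxmatch(p_case,Event_Log_Derived_Details);
run;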

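/*uses place of 'Case' in string to get substring with '/' as the delimiter to extract the desired ID 'case' number*/

data find_case;
set p_case;
/*skip rows without 'Case': substr() with a start position of 0 is an invalid argument*/
if find_case > 0 then case = scan(substr(Event_Log_Derived_Details,find_case),2,'/');
run;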
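
The two steps above can probably be collapsed into one by letting the pattern capture the token that follows 'Case'. This is only a sketch under assumptions: the real layout of Event_Log_Derived_Details was not posted, so the sample value and the dataset name demo_case are made up, and the pattern assumes the ID is the next '/'-delimited segment after the one containing 'Case' (roughly what scan(...,2,'/') does).

```sas
data demo_case;
    length Event_Log_Derived_Details $100 case $20;
    /* hypothetical sample value; the real data layout is unknown */
    Event_Log_Derived_Details = 'abc/123/Case/98765/xyz';
    /* compile once and keep the pattern id across rows */
    retain re;
    if _N_ = 1 then re = prxparse('/Case[^\/]*\/([^\/]+)/');
    /* prxmatch returns 0 when 'Case' is absent, so case stays blank */
    if prxmatch(re, Event_Log_Derived_Details) then
        case = prxposn(re, 1, Event_Log_Derived_Details); /* capture buffer 1 */
    drop re;
run;
```

With the sample value above, case comes out as 98765. PRXPOSN reads the capture buffer from the most recent PRXMATCH call, so the two calls need to stay together.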

ballardw · Posted 10-15-2021 10:50 AM

A couple of concrete examples and the expected results are always a good idea.

This may not even require regex.

happy_sas_kitty · Posted 10-17-2021 10:48 PM

Basically what @ballardw said. Can you provide an example of what you have and what you want?

I'm guessing what you have and what you want is something like this:

data test;
    x = 'Case123:word/Case456:worrd/Case789:worrrd';
run;
data lines;
    set test;
    Length word $20;
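    /* prxmatch returns the position of the first match */
    times = prxmatch('/Case/',x);/* returns 1 */
    /* note: countw and scan take a list of delimiter characters, so 'Case' */
    /* means "split on C, a, s or e", not on the substring 'Case'. It works */
    /* here only because the rest of the text contains none of those letters. */
    times2= countw(x,'Case');
    do i = 1 to times2;
        word = scan(x,i,'Case');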
        output;
    end;
run;

Afterwards you can handle the extracted "word" variable.

For example, compress(word,'/'); removes the '/' at the end.

So you don't need a regular expression. Instead, you can try the countw function.

Or, if you really want to use regex, you can look up the CALL PRXNEXT routine: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/n1obc9u7z3225mn1npwnassehff0.h...

data test;
   ExpressionID = prxparse('/((Case[0-9]*:.*?\/)|(Case[0-9]*:.*?$))/');
   text = 'Case123:word/Case456:worrd/Case789:worrrd';
   start = 1;
   stop = length(text);
      /* Use PRXNEXT to find the first instance of the pattern, */
      /* then use DO WHILE to find all further instances.       */
      /* PRXNEXT changes the start parameter so that searching  */
      /* begins again after the last match.                     */
   output; /* for test purpose */
   call prxnext(ExpressionID, start, stop, text, position, length);
   output; /* for test purpose */
      do while (position > 0);
         found = substr(text, position, length);
         call prxnext(ExpressionID, start, stop, text, position, length);
         output;
      end;
run;
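
If only the ID digits are wanted rather than the whole 'Case...:word/' chunk, the same PRXNEXT loop can be paired with a capture group. A minimal sketch, assuming the IDs are runs of digits directly after 'Case' as in the test string above; the dataset name case_ids and the variable names case_id and len are invented for illustration.

```sas
data case_ids;
    length text $60 case_id $10;
    text = 'Case123:word/Case456:worrd/Case789:worrrd';
    /* capture buffer 1 holds the digits that follow 'Case' */
    re = prxparse('/Case(\d+)/');
    start = 1;
    stop = length(text);
    /* find the first match, then loop until PRXNEXT reports none */
    call prxnext(re, start, stop, text, position, len);
    do while (position > 0);
        case_id = prxposn(re, 1, text); /* digits from the latest match */
        output;
        call prxnext(re, start, stop, text, position, len);
    end;
    keep case_id;
run;
```

For this test string the step writes three rows: 123, 456 and 789.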

Tom · Posted 10-18-2021 12:50 AM

Doesn't sound like something that needs regex. Just use INDEX(), SUBSTRN() and SCAN().

data test;
  length string $50 case $10 ;
  input string $50.;
  loc = index(upcase(string),'CASE');
  if loc then case=scan(substrn(string,loc+4),1,'/');
cards;
Blah blah/Case123/Blah blah
Blah blah
;

Obs    string                         case    loc

 1     Blah blah/Case123/Blah blah    123      11
 2     Blah blah                                0
