Hi Experts,
I have a variable that holds several strings separated by a pipe sign, e.g. VAR1 = STRING1|STRING2|STRING3 and so on.
Only one of the strings will contain the KEYWORD "EXTRACT_ME".
I would like to extract only the string that contains the EXTRACT_ME keyword (an example is shown below).
I can definitely achieve this with a DO loop that searches each string for the KEYWORD one by one and extracts the match.
But I'm looking for something more efficient than putting it in a DO loop. Maybe a Perl regular expression?
Experts: any advice? I appreciate your help.
INPUT_VAR                                                  OUTPUT_VAR
FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING                 EXTRACT_ME[ABC]
STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR    EXTRACT_ME[ABCDEFG]
You can do it with a combination of string functions, but a Perl regular expression will probably work better.
data have;
string="FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING";output;
string="STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR";output;
run;
data want;
set have;
loc= index(string, "EXTRACT_ME");
if loc>0 then end= index(substr(string, loc), "|");
want=substr(string, loc, end-1);
run;
You'll need a slight modification, since LOC > 0 should control execution of the remainder of the statements (not just one).
A similar possibility:
if loc > 0 then want = scan(substr(string, loc), 1, '|');
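Putting the two suggestions above together, a complete step might look like the sketch below. It assumes the keyword appears in upper case (INDEX is case-sensitive) and that the wanted piece starts at the keyword itself; any text between the preceding pipe and the keyword would be dropped. The third input row and the length of 200 are hypothetical additions, included only to show that a row without the keyword leaves WANT blank.

```sas
data have;
  length string $ 200;
  string = "FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING"; output;
  string = "STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR"; output;
  string = "NO KEYWORD|IN THIS ROW"; output;  /* hypothetical row without the keyword */
run;

data want;
  set have;
  length want $ 200;                 /* assumed maximum length */
  /* position of the keyword, 0 if it is absent */
  loc = index(string, "EXTRACT_ME");
  /* take everything from the keyword up to the next pipe (or end of string) */
  if loc > 0 then want = scan(substr(string, loc), 1, '|');
  drop loc;
run;
```

Because the single IF statement guards the only statement that uses LOC, no DO/END block is needed here.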
What reason forces you to use PRX?
data have;
string="FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING";output;
string="STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR";output;
run;
data want;
set have;
length want $ 200;
pid=prxparse('/EXTRACT_ME\[\w+\]/oi');
call prxsubstr(pid,string,p,l);
if p then want=substr(string,p,l);
drop pid p l;
run;
Just in case you don't need the redundant "EXTRACT_ME[]" wrapper, but only what's inside the square brackets after this keyword, you can modify Ksharp's Perl regular expression, for example as follows:
pid=prxparse('/(?<=EXTRACT_ME\[)[^\]]+/oi');
This expression matches a sequence of one or more characters not equal to "]" after the text "EXTRACT_ME[" (case-insensitive).
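For completeness, here is a minimal sketch of that pattern dropped into a full DATA step, reusing the sample data from this thread. The variable name INNER and its length of 200 are assumptions chosen for illustration.

```sas
data have;
  length string $ 200;
  string = "FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING"; output;
  string = "STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR"; output;
run;

data want;
  set have;
  length inner $ 200;                /* hypothetical output variable */
  /* lookbehind: match starts right after "EXTRACT_ME[" and runs up to, not including, "]" */
  pid = prxparse('/(?<=EXTRACT_ME\[)[^\]]+/oi');
  call prxsubstr(pid, string, p, l); /* p = start position (0 if no match), l = match length */
  if p then inner = substr(string, p, l);
  drop pid p l;
run;
```

With the two sample rows, INNER should come out as ABC and ABCDEFG; a row without the keyword would leave it blank.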