Hi Experts,
I have a variable that holds several strings separated by a pipe sign, for example: VAR1 = STRING1|STRING2|STRING3 and so on.
Only one of these strings will contain the KEYWORD "EXTRACT_ME".
I would like to extract only the string that contains the EXTRACT_ME keyword (an example is shown below).
I can definitely achieve this with a DO loop that searches each string for the KEYWORD one by one and extracts the match.
But I'm looking for something more efficient that doesn't need a DO loop. Maybe a Perl regular expression?
Experts: any advice? I appreciate your help.
| INPUT_VAR | OUTPUT_VAR |
| FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING | EXTRACT_ME[ABC] |
| STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR | EXTRACT_ME[ABCDEFG] |
You can do it with a combination of string functions, but PERL will probably work better.
data have;
string="FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING";output;
string="STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR";output;
run;
data want;
set have;
loc= index(string, "EXTRACT_ME");
if loc>0 then end= index(substr(string, loc), "|");
want=substr(string, loc, end-1);
run;
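You'll need a slight modification, since LOC > 0 should control execution of all the remaining statements (not just the one that computes END). Note also that END is 0 when the keyword is in the last segment, because no pipe follows it, so SUBSTR would receive a length of -1.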
A similar possibility:
if loc > 0 then want = scan(substr(string, loc), 1, '|');
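Putting the two suggestions above together, a complete step might look like the sketch below. It assumes the HAVE data set from the earlier post and that the keyword starts its segment; the length of 200 is an arbitrary choice, not something stated in the thread.

```sas
data want;
  set have;
  length want $ 200;                  /* arbitrary length, adjust as needed */
  loc = index(string, "EXTRACT_ME");  /* position of the keyword, 0 if absent */
  /* SCAN returns the text up to the next pipe, or to the end of the string
     when the keyword is in the last segment */
  if loc > 0 then want = scan(substr(string, loc), 1, '|');
  drop loc;
run;
```

Because SCAN stops at the next pipe or at the end of the value, this version does not need the separate END calculation. If the keyword can sit in the middle of a segment, only the part from the keyword onward is returned.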
What reason forces you to use PRX?
data have;
string="FIRST STRING|EXTRACT_ME[ABC]|SECOND STRING";output;
string="STRING ONE|EXTRACT_ME[ABCDEFG]|STRING THREE|STRING FOUR";output;
run;
data want;
set have;
length want $ 200;
pid=prxparse('/EXTRACT_ME\[\w+\]/oi');
call prxsubstr(pid,string,p,l);
if p then want=substr(string,p,l);
drop pid p l;
run;
In case you don't need the redundant "EXTRACT_ME[]" but only what's inside the square brackets after this keyword, you can modify Ksharp's Perl regular expression, for example as follows:
pid=prxparse('/(?<=EXTRACT_ME\[)[^\]]+/oi');
This expression matches a sequence of one or more characters not equal to "]" after the text "EXTRACT_ME[" (case-insensitive).
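For reference, here is a sketch of how that modified expression could be dropped into the earlier DATA step. It assumes the same HAVE data set and reuses the variable names and the arbitrary length of 200 from that step.

```sas
data want;
  set have;
  length want $ 200;
  /* lookbehind: the match starts right after "EXTRACT_ME[" and runs up to,
     but does not include, the closing "]" */
  pid = prxparse('/(?<=EXTRACT_ME\[)[^\]]+/oi');
  call prxsubstr(pid, string, p, l);  /* p = start position, l = match length */
  if p then want = substr(string, p, l);
  drop pid p l;
run;
```

With the two sample rows, WANT should come out as ABC and ABCDEFG rather than the full EXTRACT_ME[...] segment.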