Hi there,
Can anyone help me with a Perl regular expression?
Example:
data have;
input ID:$1. Words :$50.;
datalines;
A Googled
B G.O.L
C GOL
D "G O L"
;
run;
data test;
set have;
Check=prxmatch("/GOL|G.O.L|G O L/", words);
run;
The result I need is B, C, and D, but not A, using prxmatch.
Thanks for help 🙂
Hi @Suzy_Cat You have answered your own question. What's the problem? Are you asking how to subset?
data have;
input ID:$1. Words $50.;
datalines;
A Googled
B G.O.L
C GOL
D "G O L"
;
run;
data test;
set have;
if prxmatch("/GOL|G.O.L|G O L/", words);
run;
Whoops, my bad. I accidentally put / instead of | between the alternatives, so no wonder it was not working earlier...
Check=prxmatch("/GOL|G.O.L/G O L/", words);
Also, there was an extra : in the INPUT statement when I tested earlier:
data have;
input ID:$1. Words :$50.;
datalines;
A Googled
B G.O.L
C GOL
D "G O L"
;
run;
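To see why the extra : matters, here is a minimal sketch comparing the two INPUT styles on row D only. The dataset names with_colon and without_colon are made up for illustration. The assumption is that no DSD option is in effect, so the quotation marks are read as ordinary characters.

```sas
/* Modified list input (:$50.) stops reading at the first blank */
data with_colon;
input ID :$1. Words :$50.;
datalines;
D "G O L"
;
run;

/* Formatted input ($50.) reads up to 50 columns, embedded blanks included */
data without_colon;
input ID :$1. Words $50.;
datalines;
D "G O L"
;
run;

proc print data=with_colon noobs;
run;

proc print data=without_colon noobs;
run;
```

With the colon, Words for row D should hold only the text before the first blank, so a pattern looking for G O L has nothing to match. Without it, the whole quoted string is kept.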
A precise regex would be
data test;
set have;
if prxmatch("/G(\.|\s)?O(\.|\s)?L/", words);
run;
This checks for a literal dot or a whitespace character between the letters. The ? after each capture group makes that separator optional.
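Hi @Suzy_Cat A further refinement: there is no need to create capture buffer 2, because we can back-reference buffer 1. I'm an idiot sometimes. Note that \1 matches only the same separator text that buffer 1 captured, so a mixed form such as G.O L would not match.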
So
data have;
input ID:$1. Words $50.;
datalines;
A Googled
B G.O.L
C GOL
D "G O L"
;
run;
data test;
set have;
if prxmatch("/G(\.|\s)?O\1?L/", words);
run;
What's the problem?
Maybe this is simpler?
CHECK = prxmatch('/G\W?O\W?L/', WORDS);
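The three patterns suggested in this thread are not equivalent, and a side-by-side run makes the difference visible. This is only a sketch: rows E and F and the variable names two_groups, backref, and nonword are additions for illustration, not part of the original data.

```sas
data have;
input ID :$1. Words $50.;
datalines;
A Googled
B G.O.L
C GOL
D "G O L"
E G-O-L
F G.O L
;
run;

data compare;
set have;
/* Two optional groups: each separator may be a dot, a whitespace character, or absent */
two_groups = prxmatch('/G(\.|\s)?O(\.|\s)?L/', Words) > 0;
/* Back-reference: the second separator must repeat the first, or be absent */
backref = prxmatch('/G(\.|\s)?O\1?L/', Words) > 0;
/* \W: any single non-word character between the letters, or none */
nonword = prxmatch('/G\W?O\W?L/', Words) > 0;
run;

proc print data=compare noobs;
run;
```

All three flags should be 0 for A and 1 for B, C, and D. The patterns should differ only on the added rows: \W also accepts the hyphens in row E, and the back-reference version rejects the mixed separators in row F. Which one is right depends on how strict the match needs to be.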
Thank you Chris,
Your suggestion is exactly what I was after...
🙂