Hello all,
I want to use the PRXPARSE function to extract substrings like these:
'10-30 oz', '25 lb', '2.5 - 55 lbs', etc.
I am using this:
PATTERN =PRXPARSE("/(\d+\.*\d+ *-* *\d+\.*\d+ *)|(\d+\.?\d+-\d+\.?\d+ (lb|oz))|(\d+\.?\d+\.?\d+ (lb|oz))|(\d+\.?\d+? (lb|oz))|(\d+? *(lb|oz))|(\d *- *\d *(lb|oz))|(\d+?.\d+? *(lb|oz))|(\d+.\d+ *- *\d *(lb|oz))|(\d *- *\d+.\d+ *(lb|oz))/");
but it does not give good results.
Please help.
Thanks
We need to know what your full text looks like. Also, what issues are occurring with your current code?
Can you provide a few test strings that cover the range of your input text: things that should match and strings that shouldn't?
Maybe this will do:
"/\d+(\.\d*)?\s{0,2}(-\s{0,2}\d+(\.\d*)?)?\s?(lbs|lb|oz)/i"
(not tested much)
data have;
   length x $ 100;
   x='10-30 oz'; output;
   x='25 lb'; output;
   x='2.5 - 55 lbs'; output;
run;

data _null_;
   set have;
   if prxmatch('/([\s\d\.]+\-)?[\s\d\.]+(oz|lb|lbs)/i',x) then putlog 'Matched';
   else putlog 'Not Matched';
run;
Xia, please note that you must list the longer alternatives first, (oz|lbs|lb) for example. The regex engine tries the alternatives from left to right and stops at the first one that succeeds, so at the end of a pattern (oz|lb|lbs) will never match the s in lbs.
A comparison:
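To make the ordering point concrete, here is a minimal sketch that runs both orderings against one of the sample strings from this thread. The variable names (shortFirst, longFirst, m) and the shortened patterns are only illustrative; they are not anyone's actual solution.

```sas
data _null_;
   length m $20;
   str = '2.5 - 55 lbs';   /* sample string from this thread */

   /* lb is listed before lbs, so the alternation succeeds at lb */
   shortFirst = prxparse('/\d+\s?(oz|lb|lbs)/i');
   /* lbs is listed before lb, so the full unit can be consumed */
   longFirst  = prxparse('/\d+\s?(lbs|lb|oz)/i');

   if prxmatch(shortFirst, str) then do;
      m = prxposn(shortFirst, 0, str);   /* buffer 0 = entire match */
      putlog 'shorter alternative first: ' m=;
   end;
   if prxmatch(longFirst, str) then do;
      m = prxposn(longFirst, 0, str);
      putlog 'longer alternative first:  ' m=;
   end;
run;
```

The first PUTLOG line should show the match cut off at lb, while the second should include the trailing s.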
data test;
   length str $64;
   do str =
      "The Wizard of Oz says",
      "There are 10-30 oz of potatoes",
      "or 25 lb of onions",
      "and 2.5 - 55 lbs of lard in this delicious recipe.";
      output;
   end;
run;

data amtPG;
   length subStr $20;
   /* The sum statement retains prx1, so the pattern is compiled only once */
   if not prx1 then prx1 +
      prxparse("/\d+(\.\d*)?\s{0,2}(-\s{0,2}\d+(\.\d*)?)?\s?(lbs|lb|oz)/i");
   set test;
   if prxmatch(prx1, str) then do;
      subStr = prxposn(prx1, 0, str);   /* buffer 0 = entire match */
      output;
   end;
   drop prx1;
run;

title "PG's pattern";
proc print data=amtPG noobs; run;

data amtXK;
   length subStr $20;
   /* The sum statement retains prx1, so the pattern is compiled only once */
   if not prx1 then prx1 +
      prxparse('/([\s\d\.]+\-)?[\s\d\.]+(oz|lb|lbs)/i');
   set test;
   if prxmatch(prx1, str) then do;
      subStr = prxposn(prx1, 0, str);   /* buffer 0 = entire match */
      output;
   end;
   drop prx1;
run;

title "XK's pattern";
proc print data=amtXK noobs; run;

PG's pattern

subStr         str
10-30 oz       There are 10-30 oz of potatoes
25 lb          or 25 lb of onions
2.5 - 55 lbs   and 2.5 - 55 lbs of lard in this delicious recipe.

XK's pattern

subStr         str
Oz             The Wizard of Oz says
10-30 oz       There are 10-30 oz of potatoes
25 lb          or 25 lb of onions
2.5 - 55 lb    and 2.5 - 55 lbs of lard in this delicious recipe.
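XK's pattern picks up "Oz" in the first string because [\s\d\.]+ is satisfied by a single space, so no digit is required before the unit. One possible tightening, sketched below, makes every number start with a digit and lists lbs before lb. The pattern and the data set name amtDigit are hypothetical and have only been reasoned through against the four test strings above, so treat this as a starting point rather than a finished solution.

```sas
data amtDigit;
   length str $64 subStr $20;
   /* Hypothetical pattern: each number must begin with a digit */
   prx1 = prxparse('/(\d[\d\.]*\s*-\s*)?\d[\d\.]*\s*(lbs|lb|oz)/i');
   /* No SET statement, so this step runs once and loops over the strings */
   do str =
      "The Wizard of Oz says",
      "There are 10-30 oz of potatoes",
      "or 25 lb of onions",
      "and 2.5 - 55 lbs of lard in this delicious recipe.";
      if prxmatch(prx1, str) then do;
         subStr = prxposn(prx1, 0, str);   /* buffer 0 = entire match */
         output;
      end;
   end;
   drop prx1;
run;

title "Digit-anchored pattern";
proc print data=amtDigit noobs; run;
```

Like the other patterns in this thread, this one would still accept a unit that is only the start of a longer word, so a word boundary (\b) after the unit may be worth adding depending on the real data.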