I need to extract specific numbers from a variable (cmaeno). Some records have one number; others have several that I need. The numbers I need are the ones inside the parentheses "( )".
I do not know where to start syntax-wise. Any help would be greatly appreciated.
Regular expression matching is well suited for this kind of work:
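data want;
    /* Compile the pattern once; the sum statement retains prxId across rows */
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    /* Find the first "(digits)" match in cmaeno */
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        /* Capture group 1 holds the digits without the parentheses */
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output; /* one output row per number found */
        call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;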
EDIT: Code corrected, thanks to the example provided by @Patrick.
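To try the step above without the real data, a small test dataset can stand in for `have`. This is only a sketch: the values below are made up for illustration, and the length of 40 for cmaeno is an assumption, not something taken from the original data.

```sas
/* Hypothetical sample data; the real cmaeno values will differ */
data have;
    length cmaeno $40;
    infile datalines truncover;
    input cmaeno $char40.;
    datalines;
Headache (3)
Nausea (12), Fever (7)
None
;
run;
```

With this input, the `want` step should write one row per number in parentheses (3, 12 and 7). Because the OUTPUT statement sits inside the DO WHILE loop, a record with no match, such as the last one, produces no row at all. If such records need to be kept, an extra OUTPUT for the no-match case would have to be added.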
I'm getting a blank "cmaenum" column and a blank "text" column:
data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
cmaenum = input(substr(text, pos, len), best.);
output;
call prxNext(prxID, start, stop, cmaeno, pos, len);
end;
drop prxId start stop pos len;
run;
Here's what I get:
Please check the corrected version above.
Thank you! The corrected code worked perfectly. 🙂
Basically the same as what @PGStats already posted while I was still coding.
data have;
    varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
    output;
    stop;
run;
data want(drop=_:);
    set have;
    if _n_=1 then
        do;
            _expID = prxparse('/(?<=\()\d+(?=\))/');
            retain _expID;
        end;
    _start = 1;
    _stop = length(varWithText);
    /* Use PRXNEXT to find the first instance of the pattern, */
    /* then use DO WHILE to find all further instances. */
    /* PRXNEXT changes the start parameter so that searching */
    /* begins again after the last match. */
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    do while (_pos > 0);
        found = substr(varWithText, _pos, _len);
        /* put found= _pos= _len=; */
        output;
        call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    end;
run;
Code based on sample found here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm
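The two answers use different patterns to reach the same digits, and the difference explains why one reads the result with prxPosn and the other with substr. The sketch below compares them on a single made-up string; the variable names (s, id1, id2, and so on) are illustrative only and do not come from either answer.

```sas
data _null_;
    s = 'abc (5) 23xx(9)';
    /* Capture-group pattern: the match itself includes the parentheses */
    id1 = prxparse('/\((\d+)\)/');
    /* Lookaround pattern: the parentheses are required but not part of the match */
    id2 = prxparse('/(?<=\()\d+(?=\))/');
    call prxsubstr(id1, s, pos1, len1);
    call prxsubstr(id2, s, pos2, len2);
    whole1 = substr(s, pos1, len1);  /* whole match for pattern 1: (5) */
    group1 = prxposn(id1, 1, s);     /* capture group 1 for pattern 1: 5 */
    whole2 = substr(s, pos2, len2);  /* whole match for pattern 2: 5 */
    put whole1= group1= whole2=;
run;
```

With the capture-group pattern, taking substr at pos and len returns the parentheses as well, so the digits have to come from capture group 1. With the lookaround pattern, pos and len already cover only the digits.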