I need to extract specific numbers from a variable (cmaeno). Some records have one number I need; others have several. The numbers that I need are inside the parentheses "( )":
I do not know where to start syntax-wise. Any help would be greatly appreciated.
Regular expression matching is well suited for this kind of work:
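data want;
    /* Compile the pattern once; the sum statement retains prxId across rows. */
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    /* Find the first "(digits)" match, then loop over any further matches. */
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        /* Capture buffer 1 holds the digits without the parentheses. */
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output;
        call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;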
EDIT: Code corrected, thanks to the example provided by @Patrick.
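To try the step above without the real data, a small test table can help. The sketch below is only an illustration: the cmaeno values are made up, since the original sample records are not shown in this thread.

```sas
/* Hypothetical test data: these cmaeno values are invented for illustration. */
data have;
    length cmaeno $60;
    infile datalines truncover;
    input cmaeno $60.;
    datalines;
Headache (3)
Nausea (12), Rash (7)
None
;
run;
```

With this input, the step should write one row per number in parentheses (3, 12 and 7). Because OUTPUT only runs inside the loop, a record with no number in parentheses, such as the last one, produces no row in want.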
I'm getting a blank "cmaenum" column and a blank "text" column:
data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
cmaenum = input(substr(text, pos, len), best.);
output;
call prxNext(prxID, start, stop, cmaeno, pos, len);
end;
drop prxId start stop pos len;
run;
Here's what I get:
Please check corrected version above.
Thank you! The corrected code worked perfectly. 🙂
Basically the same as what @PGStats already posted while I was still coding.
data have;
    varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
    output;
    stop;
run;
data want(drop=_:);
    set have;
    if _n_=1 then
        do;
            _expID = prxparse('/(?<=\()\d+(?=\))/');
            retain _expID;
        end;
    _start = 1;
    _stop = length(varWithText);
    /* Use PRXNEXT to find the first instance of the pattern, */
    /* then use DO WHILE to find all further instances. */
    /* PRXNEXT changes the start parameter so that searching */
    /* begins again after the last match. */
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    do while (_pos > 0);
        found = substr(varWithText, _pos, _len);
        /* put found= _pos= _len=; */
        output;
        call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    end;
run;
Code based on sample found here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm
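The two answers in this thread differ mainly in the pattern. One matches the parentheses and reads the digits from a capture group with PRXPOSN; the other uses lookbehind/lookahead so that the match itself is only the digits, which SUBSTR can then cut out. A minimal side-by-side sketch on the sample string from above (the variable names s, idGroup, idLook, viaGroup and viaLook are made up for this illustration):

```sas
data _null_;
    length s $40 viaGroup viaLook $8;
    s = 'abc (5) 23xx(9)abc94(345)4 73xy';

    /* Capture-group pattern: the match includes the parentheses. */
    idGroup = prxparse('/\((\d+)\)/');
    /* Lookaround pattern: the match is the digits only. */
    idLook = prxparse('/(?<=\()\d+(?=\))/');

    /* Variant 1: read capture buffer 1 with PRXPOSN. */
    start = 1;
    stop = length(s);
    call prxnext(idGroup, start, stop, s, pos, len);
    do while (pos > 0);
        viaGroup = prxposn(idGroup, 1, s);
        put viaGroup=;
        call prxnext(idGroup, start, stop, s, pos, len);
    end;

    /* Variant 2: pos and len already describe just the digits. */
    start = 1;
    stop = length(s);
    call prxnext(idLook, start, stop, s, pos, len);
    do while (pos > 0);
        viaLook = substr(s, pos, len);
        put viaLook=;
        call prxnext(idLook, start, stop, s, pos, len);
    end;
run;
```

Both loops should write 5, 9 and 345 to the log. Either result is still character, so wrap it in INPUT, as the first answer does, if a numeric variable is needed.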