I need to extract specific numbers from a variable (cmaeno). Some records have one number; others have several that I need. The numbers I need are the ones inside the parentheses "( )".
I do not know where to start syntax-wise. Any help would be greatly appreciated.
Regular expression matching is well suited for this kind of work:
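data want;
    /* Compile the pattern once; the sum statement retains prxId across rows */
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    /* Find the first "(digits)" match, then loop over any further matches */
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        /* Capture buffer 1 holds the digits between the parentheses */
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output; /* one output row per number found */
        call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;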
EDIT: Code corrected, thanks to the example provided by @Patrick.
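To try the step above without the original data, here is a minimal sketch with a made-up test dataset. The cmaeno values below are hypothetical (the real ones were not shown in the thread); the extraction step is the same one as above, repeated only so the block runs on its own.

```sas
/* Hypothetical test data: these cmaeno values are invented for illustration */
data have;
    infile datalines truncover;
    input cmaeno $char40.;
    datalines;
Headache (3)
Nausea (12), Rash (7)
No event number
;
run;

/* Same extraction step as above: one output row per "(digits)" match */
data want;
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output;
        call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;

proc print data=want;
    var cmaeno cmaenum;
run;
```

With this data I would expect three rows in want (cmaenum 3, 12 and 7). Note that a record with no number in parentheses produces no output row at all, because the OUTPUT statement only runs inside the loop; if you need to keep such records, an extra OUTPUT for the no-match case would be required.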
I'm getting a blank "cmaenum" column and a blank "text" column:
data cm02;
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set toc6.cm;
    start = 1;
    stop = length(cmaeno);
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        cmaenum = input(substr(text, pos, len), best.);
        output;
        call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;
Here's what I get:
Please check corrected version above.
Thank you! The corrected code worked perfectly. 🙂
Basically the same as what @PGStats already posted while I was still coding.
data have;
    varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
    output;
    stop;
run;

data want(drop=_:);
    set have;
    if _n_=1 then
        do;
            /* Lookbehind/lookahead: match only the digits between "(" and ")" */
            _expID = prxparse('/(?<=\()\d+(?=\))/');
            retain _expID;
        end;
    _start = 1;
    _stop = length(varWithText);
    /* Use PRXNEXT to find the first instance of the pattern, */
    /* then use DO WHILE to find all further instances. */
    /* PRXNEXT changes the start parameter so that searching */
    /* begins again after the last match. */
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    do while (_pos > 0);
        found = substr(varWithText, _pos, _len);
        /* put found= _pos= _len=; */
        output;
        call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    end;
run;
Code based on sample found here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm
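The two answers in this thread differ mainly in the pattern. One captures the digits in a group and reads them with PRXPOSN; the other uses lookarounds so that the match itself is just the digits and SUBSTR can be used directly. A rough side-by-side sketch follows; the variable names (s, id1, id2, digits1, digits2) and the test string are made up for illustration.

```sas
data _null_;
    s = 'abc (5) 23xx(9)';

    /* Capture-group style: the match includes the parentheses, */
    /* and capture buffer 1 holds the digits.                   */
    id1 = prxparse('/\((\d+)\)/');
    if prxmatch(id1, s) then do;
        digits1 = prxposn(id1, 1, s);
        put digits1=;
    end;

    /* Lookaround style: the parentheses are asserted but not   */
    /* consumed, so position/length cover only the digits.      */
    id2 = prxparse('/(?<=\()\d+(?=\))/');
    call prxsubstr(id2, s, pos, len);
    if pos > 0 then do;
        digits2 = substr(s, pos, len);
        put digits2=;
    end;
run;
```

Both variants should pick up the first number in parentheses. Which one to use is mostly a matter of taste: the capture group keeps the pattern easier to read, while the lookaround version avoids the extra PRXPOSN call.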