I need to extract specific numbers from a variable (cmaeno). Some records contain one number; others contain several. The numbers I need are the ones inside the parentheses "( )":
I do not know where to start syntax-wise. Any help would be greatly appreciated.
Regular expression matching is well suited for this kind of work:
data want;
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output;
        call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;
EDIT: Code corrected, thanks to the example provided by @Patrick.
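To try the step above without the real data, a small test dataset can stand in for `have`. This is only a sketch: the `cmaeno` values below are invented for illustration, and the length of 40 is an assumption about the real variable.

```sas
/* Hypothetical sample data: these cmaeno values are made up for testing */
data have;
    length cmaeno $40;
    cmaeno = 'Headache (12)';         output;  /* one number in parentheses  */
    cmaeno = 'Nausea (3), Rash (45)'; output;  /* two numbers in parentheses */
    cmaeno = 'No number here';        output;  /* no match at all            */
run;

/* After running the DATA WANT step above, inspect the result */
proc print data=want;
    var cmaeno cmaenum;
run;
```

Because OUTPUT is only executed inside the DO WHILE loop, `want` gets one row per parenthesized number, and a record with no match (like the third one here) contributes no rows.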
I'm getting a blank "cmaenum" column and a blank "text" column:
data cm02;
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set toc6.cm;
    start = 1;
    stop = length(cmaeno);
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        cmaenum = input(substr(text, pos, len), best.);
        output;
        call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;
Here's what I get:
Please check corrected version above.
Thank you! The corrected code worked perfectly. 🙂
Basically the same as what @PGStats already posted while I was still coding.
data have;
    varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
    output;
    stop;
run;
data want(drop=_:);
    set have;
    if _n_=1 then
        do;
            _expID = prxparse('/(?<=\()\d+(?=\))/');
            retain _expID;
        end;
    _start = 1;
    _stop = length(varWithText);
    /* Use PRXNEXT to find the first instance of the pattern, */
    /* then use DO WHILE to find all further instances. */
    /* PRXNEXT changes the start parameter so that searching */
    /* begins again after the last match. */
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    do while (_pos > 0);
        found = substr(varWithText, _pos, _len);
        /* put found= _pos= length=;*/
        output;
        call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    end;
run;
Code based on sample found here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm