mandonium
Fluorite | Level 6

I need to extract specific numbers from a variable (cmaeno). Some records have one number, others have several. The numbers I need are the ones inside the parentheses "( )":

 

[Attached image: what_i_have_what_i_need_20190423.JPG]

 

I do not know where to start syntax-wise. Any help would be greatly appreciated.

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

Regular expression matching is well suited for this kind of work:

 

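data want;
    /* Compile the pattern once; the sum statement retains prxId across rows */
    if not prxId then prxId + prxParse("/\((\d+)\)/");
    set have;
    start = 1;
    stop = length(cmaeno);
    /* Find the first parenthesized number, then loop over the rest */
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        /* Capture group 1 holds the digits without the parentheses */
        cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
        output;
        call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
    drop prxId start stop pos len;
run;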

 

EDIT: Code corrected, thanks to the example provided by @Patrick.

PG
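To try the step above without the original data, a small test table can help. The sketch below builds a `have` data set with a `cmaeno` column; the three values are made up for illustration and are not the poster's records.

```sas
/* Hypothetical test data: the values are assumptions for illustration only */
data have;
    infile datalines truncover;
    length cmaeno $40;
    input cmaeno $char40.;
    datalines;
Headache (3)
Nausea (12), Rash (104)
No number here
;
run;
```

Run against this table, the solution should write one row per parenthesized number (3, 12 and 104). Because OUTPUT only executes inside the DO WHILE loop, a record with no match, such as the third one, produces no row in `want`.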


5 REPLIES 5
mandonium
Fluorite | Level 6

I'm getting a blank "cmaenum" column and a blank "text" column:

 

data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
    cmaenum = input(substr(text, pos, len), best.);
    output;
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;

Here's what I get:

[Attached image: result_1.JPG]
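A likely explanation, offered as a sketch rather than a diagnosis of the actual data: the step calls `substr(text, pos, len)`, but `text` does not seem to be a variable in `toc6.cm`, so SAS creates it empty and `cmaenum` ends up missing. The matched text lives in `cmaeno`, and `pos`/`len` describe the whole match including the parentheses, whereas `prxPosn` returns only the capture group. The sample value below is hypothetical.

```sas
data _null_;
    /* Hypothetical sample value; not one of the poster's records */
    cmaeno = 'Nausea (12), Rash (104)';
    prxId = prxparse('/\((\d+)\)/');
    start = 1;
    stop = length(cmaeno);
    call prxnext(prxId, start, stop, cmaeno, pos, len);
    do while (pos > 0);
        /* pos and len cover the whole match, parentheses included */
        whole = substr(cmaeno, pos, len);
        /* prxposn returns only capture group 1, i.e. the digits */
        digits = prxposn(prxId, 1, cmaeno);
        cmaenum = input(digits, best.);
        /* Write all three values to the log for comparison */
        put whole= digits= cmaenum=;
        call prxnext(prxId, start, stop, cmaeno, pos, len);
    end;
run;
```

Comparing `whole` and `digits` in the log shows why reading capture group 1 is the more convenient route to a clean numeric value.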

PGStats
Opal | Level 21

Please check corrected version above.

PG
mandonium
Fluorite | Level 6

Thank you!  The corrected code worked perfectly.  🙂

Patrick
Opal | Level 21

@mandonium 

Basically the same as what @PGStats already posted while I was still coding.

 

data have;
  varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
  output;
  stop;
run;

data want(drop=_:);
  set have;
  if _n_=1 then
    do;
      _expID = prxparse('/(?<=\()\d+(?=\))/');
      retain _expID;
    end;
  _start = 1;
  _stop = length(varWithText);

  /* Use PRXNEXT to find the first instance of the pattern, */
  /* then use DO WHILE to find all further instances.       */
  /* PRXNEXT changes the start parameter so that searching  */
  /* begins again after the last match.                     */
  call prxnext(_expID, _start, _stop, varWithText, _pos, _len);

  do while (_pos > 0);
    found = substr(varWithText, _pos, _len);
/*    put found= _pos= _len=;*/
    output;
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  end;
run;

 

Code based on sample found here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm 
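The step above stores each match in the character variable `found`. If a numeric column is wanted, as in the original question, one option is to pass the matched substring through INPUT. This is a minimal sketch under that assumption; the names `want_num` and `foundNum` are made up for illustration.

```sas
/* Same sample record as in the post above */
data have;
    varWithText = 'abc (5) 23xx(9)abc94(345)4 73xy';
run;

data want_num(drop=_:);
    set have;
    retain _expID;
    /* Lookbehind and lookahead keep the parentheses out of the match */
    if _n_ = 1 then _expID = prxparse('/(?<=\()\d+(?=\))/');
    _start = 1;
    _stop = length(varWithText);
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    do while (_pos > 0);
        /* The match is digits only, so it can be read directly as a number */
        foundNum = input(substr(varWithText, _pos, _len), best12.);
        output;
        call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
    end;
run;
```

For the sample record this should give three rows with `foundNum` equal to 5, 9 and 345. The `best12.` width is an arbitrary choice here and would truncate numbers longer than 12 digits.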

 
