mandonium
Fluorite | Level 6

I need to extract specific numbers from a variable (cmaeno). Some records have one number; others have several that I need. The numbers I need are the ones inside the parentheses "( )":

 

what_i_have_what_i_need_20190423.JPG

 

I do not know where to start syntax-wise. Any help would be greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Regular expression matching is well suited for this kind of work:

 

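data want;
/* Compile the pattern once; the sum statement retains prxId across rows */
if not prxId then prxId + prxParse("/\((\d+)\)/");
set have;
start = 1;
stop = length(cmaeno);
/* Find the first "(digits)" match, then loop over any further matches */
call prxNext(prxId, start, stop, cmaeno, pos, len);
do while (pos > 0);
    /* Capture buffer 1 holds the digits without the parentheses */
    cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
    output;
    call prxNext(prxId, start, stop, cmaeno, pos, len);
end;
drop prxId start stop pos len;
run;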

 

EDIT Code corrected, thanks to example provided by @Patrick .

PG
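
To try the accepted code without the original data, here is a minimal sketch. The sample values in `have` are made up for illustration (only the variable name cmaeno comes from the question), and the `found` flag is an addition that is not part of the accepted answer. It is there because OUTPUT sits inside the loop, so a record with no parenthesized number would otherwise write no row at all.

```sas
/* Hypothetical test data: values are invented, only the name cmaeno is from the question */
data have;
  length cmaeno $40;
  cmaeno = 'Headache (3)';          output;
  cmaeno = 'Nausea (12), Rash (7)'; output;
  cmaeno = 'None';                  output;
run;

data want;
  /* Compile the pattern once; the sum statement retains prxId */
  if not prxId then prxId + prxParse("/\((\d+)\)/");
  set have;
  start = 1;
  stop  = length(cmaeno);
  found = 0;                        /* assumption: flag added for this sketch */
  call prxNext(prxId, start, stop, cmaeno, pos, len);
  do while (pos > 0);
    /* Capture buffer 1 = digits between the parentheses */
    cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
    output;
    found = 1;
    call prxNext(prxId, start, stop, cmaeno, pos, len);
  end;
  /* Keep records without any "(number)" as a single row with cmaenum missing */
  if not found then do;
    cmaenum = .;
    output;
  end;
  drop prxId start stop pos len found;
run;
```

If records without a match should simply be dropped, the accepted code as posted already does that and the `found` logic can be left out.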


5 REPLIES 5
mandonium
Fluorite | Level 6

I'm getting a blank "cmaenum" column and a blank "text" column:

 

data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
    cmaenum = input(substr(text, pos, len), best.);
    output;
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;

Here's what I get:

result_1.JPG

PGStats
Opal | Level 21

Please check corrected version above.

PG
mandonium
Fluorite | Level 6

Thank you!  The corrected code worked perfectly.  🙂

Patrick
Opal | Level 21

@mandonium 

Basically the same as what @PGStats already posted while I was still coding.

 

data have;
  varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
  output;
  stop;
run;

data want(drop=_:);
  set have;
  if _n_=1 then
    do;
      _expID = prxparse('/(?<=\()\d+(?=\))/');
      retain _expID;
    end;
  _start = 1;
  _stop = length(varWithText);

  /* Use PRXNEXT to find the first instance of the pattern, */
  /* then use DO WHILE to find all further instances.       */
  /* PRXNEXT changes the start parameter so that searching  */
  /* begins again after the last match.                     */
  call prxnext(_expID, _start, _stop, varWithText, _pos, _len);

  do while (_pos > 0);
    found = substr(varWithText, _pos, _len);
/*    put found= _pos= _len=;*/
    output;
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  end;
run;

 

Code based on sample found here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm 
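
The two patterns in this thread differ in what the overall match covers, which decides how the digits are read back. The following is a small sketch on a made-up string (the variable names `s`, `whole`, `group`, and `look` are hypothetical). With the capture-group pattern the match includes the parentheses, so the digits come from capture buffer 1 via PRXPOSN. With the lookaround pattern the match is the digits alone, so SUBSTR over the position and length returned by PRXNEXT is enough.

```sas
data _null_;
  length whole group look $20;
  s = 'abc (5) 23xx(9)';            /* invented sample text */

  /* Capture-group pattern, as in the accepted solution */
  id1 = prxparse('/\((\d+)\)/');
  if prxmatch(id1, s) then do;
    whole = prxposn(id1, 0, s);     /* buffer 0: entire match, parentheses included */
    group = prxposn(id1, 1, s);     /* buffer 1: digits only */
  end;

  /* Lookaround pattern: parentheses are required but not part of the match */
  id2 = prxparse('/(?<=\()\d+(?=\))/');
  if prxmatch(id2, s) then
    look = prxposn(id2, 0, s);      /* entire match is already digits only */

  put whole= group= look=;
run;
```

Either approach works. The capture-group form is arguably easier to read, while the lookaround form lets the position and length values be used directly.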

 
