mandonium
Fluorite | Level 6

I need to extract specific numbers from a variable (cmaeno). Some records have one number; others have several. The numbers I need are the ones inside the parentheses "( )":

 

[Attached image: what_i_have_what_i_need_20190423.JPG]

 

I do not know where to start syntax-wise. Any help would be greatly appreciated.

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

Regular expression matching is well suited for this kind of work:

 

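data want;
/* Compile the pattern once; the sum statement implicitly retains prxId */
if not prxId then prxId + prxParse("/\((\d+)\)/");
set have;
start = 1;
stop = length(cmaeno);
/* Find the first "(digits)" match, then loop over any further matches */
call prxNext(prxId, start, stop, cmaeno, pos, len);
do while (pos > 0);
    /* Capture buffer 1 holds the digits without the parentheses */
    cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
    output; /* one output row per number found */
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;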

 

EDIT: Code corrected, thanks to the example provided by @Patrick.

PG


5 REPLIES
mandonium
Fluorite | Level 6

I'm getting a blank "cmaenum" column and a blank "text" column:

 

data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
    cmaenum = input(substr(text, pos, len), best.);
    output;
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;

Here's what I get:

[Attached image: result_1.JPG]

PGStats
Opal | Level 21

Please check corrected version above.
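
For anyone comparing the two versions: the blank columns appear to come from `substr(text, pos, len)`. Nothing in that step assigns `text`, so SAS creates it as a new, empty variable. Separately, `pos` and `len` describe the whole match, parentheses included. The sketch below shows two ways to get just the digits. The sample values and the names `viaSubstr` and `viaPosn` are made up for illustration, since the real data is only visible in the screenshot.

```sas
/* Hypothetical sample data; the real cmaeno values are not shown in the thread */
data have;
  length cmaeno $40;
  cmaeno = 'Headache (5), Nausea (12)'; output;
  cmaeno = 'Rash (7)'; output;
run;

data want;
  if not prxId then prxId + prxParse("/\((\d+)\)/");
  set have;
  start = 1;
  stop = length(cmaeno);
  call prxNext(prxId, start, stop, cmaeno, pos, len);
  do while (pos > 0);
    /* pos/len cover the full match such as "(5)", so skip both parentheses */
    viaSubstr = input(substr(cmaeno, pos + 1, len - 2), best.);
    /* capture buffer 1 already contains only the digits */
    viaPosn = input(prxPosn(prxId, 1, cmaeno), best.);
    output;
    call prxNext(prxId, start, stop, cmaeno, pos, len);
  end;
  drop prxId start stop pos len;
run;
```

Both columns should hold the same numbers, one row per parenthesised value. The `prxPosn` form is the one used in the corrected code.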

PG
mandonium
Fluorite | Level 6

Thank you!  The corrected code worked perfectly.  🙂

Patrick
Opal | Level 21

@mandonium 

Basically the same as what @PGStats already posted while I was still coding.

 

data have;
  varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
  output;
  stop;
run;

data want(drop=_:);
  set have;
  if _n_=1 then
    do;
      _expID = prxparse('/(?<=\()\d+(?=\))/');
      retain _expID;
    end;
  _start = 1;
  _stop = length(varWithText);

  /* Use PRXNEXT to find the first instance of the pattern, */
  /* then use DO WHILE to find all further instances.       */
  /* PRXNEXT changes the start parameter so that searching  */
  /* begins again after the last match.                     */
  call prxnext(_expID, _start, _stop, varWithText, _pos, _len);

  do while (_pos > 0);
    found = substr(varWithText, _pos, _len);
/*    put found= _pos= _len=;*/
    output;
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  end;
run;

 

Code based on sample found here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm 
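
One note on the version above: `found` is a character variable. If a numeric result is wanted, as in the original question, the matched text can be passed through `input()`. This is a minimal sketch under that assumption; `foundNum` is a name chosen here for illustration and is not part of the original code.

```sas
data have;
  varWithText = 'abc (5) 23xx(9)abc94(345)4 73xy';
run;

data want(drop=_:);
  set have;
  retain _expID;
  /* Lookbehind and lookahead keep the parentheses out of the match itself */
  if _n_ = 1 then _expID = prxparse('/(?<=\()\d+(?=\))/');
  _start = 1;
  _stop = length(varWithText);
  call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  do while (_pos > 0);
    /* _pos/_len cover only the digits here, so substr needs no adjustment */
    foundNum = input(substr(varWithText, _pos, _len), best.);
    output;
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  end;
run;
```

With this sample string the step should write three rows, for the values in (5), (9) and (345). The digits that are not inside parentheses, such as 23 and 94, are skipped.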

 
