mandonium
Fluorite | Level 6

I need to extract specific numbers from a variable (cmaeno). Some records contain one number; others contain several. The numbers I need are the ones inside the parentheses "( )":

 

what_i_have_what_i_need_20190423.JPG

 

I do not know where to start syntax-wise. Any help would be greatly appreciated.

Accepted Solutions
PGStats
Opal | Level 21

Regular expression matching is well suited for this kind of work:

 

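data want;
/* Compile the pattern once; the sum statement retains prxId across rows */
if not prxId then prxId + prxParse("/\((\d+)\)/");
set have;
start = 1;
stop = length(cmaeno);
/* Find the first "(digits)" match in cmaeno */
call prxNext(prxId, start, stop, cmaeno, pos, len);
do while (pos > 0);
    /* Capture buffer 1 holds only the digits inside the parentheses */
    cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
    output;
    /* prxNext advances start, so this call finds the next match */
    call prxNext(prxId, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;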

 

EDIT Code corrected, thanks to example provided by @Patrick .

PG
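
One behaviour of the step above is worth knowing: a record with no number in parentheses never reaches the OUTPUT statement, so it is dropped from the result. If such records should be kept, a minimal sketch could look like the following. The data set have, the id variable, and the sample values are made up for illustration; only the variable name cmaeno comes from the question.

```sas
/* Hypothetical sample data, for illustration only */
data have;
  length cmaeno $60;
  id = 1; cmaeno = 'Headache (5)';          output;
  id = 2; cmaeno = 'Nausea (12), Rash (7)'; output;
  id = 3; cmaeno = 'None';                  output;
run;

data want_all;
  /* Compile the pattern once; the sum statement retains prxId */
  if not prxId then prxId + prxParse("/\((\d+)\)/");
  set have;
  start = 1;
  stop = length(cmaeno);
  call prxNext(prxId, start, stop, cmaeno, pos, len);

  /* No match at all: keep the record once, with cmaenum left missing */
  if pos = 0 then output;

  do while (pos > 0);
    cmaenum = input(prxPosn(prxId, 1, cmaeno), best.);
    output;
    call prxNext(prxId, start, stop, cmaeno, pos, len);
  end;
  drop prxId start stop pos len;
run;

proc print data=want_all noobs;
run;
```

With this sample, id 1 should yield one row, id 2 two rows (12 and 7), and id 3 one row with a missing cmaenum. Removing the `if pos = 0 then output;` line restores the original behaviour of dropping unmatched records.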


5 REPLIES
mandonium
Fluorite | Level 6

I'm getting a blank "cmaenum" column and a blank "text" column:

 

data cm02;
if not prxId then prxId + prxParse("/\((\d+)\)/");
set toc6.cm;
start = 1;
stop = length(cmaeno);
call prxNext(prxID, start, stop, cmaeno, pos, len);
do while (pos > 0);
    cmaenum = input(substr(text, pos, len), best.);
    output;
    call prxNext(prxID, start, stop, cmaeno, pos, len);
    end;
drop prxId start stop pos len;
run;

Here's what I get:

result_1.JPG

PGStats
Opal | Level 21

Please check corrected version above.

PG
mandonium
Fluorite | Level 6

Thank you!  The corrected code worked perfectly.  🙂

Patrick
Opal | Level 21

@mandonium 

Basically the same as what @PGStats already posted while I was still coding.

 

data have;
  varWithText='abc (5) 23xx(9)abc94(345)4 73xy';
  output;
  stop;
run;

data want(drop=_:);
  set have;
  if _n_=1 then
    do;
      _expID = prxparse('/(?<=\()\d+(?=\))/');
      retain _expID;
    end;
  _start = 1;
  _stop = length(varWithText);

  /* Use PRXNEXT to find the first instance of the pattern, */
  /* then use DO WHILE to find all further instances.       */
  /* PRXNEXT changes the start parameter so that searching  */
  /* begins again after the last match.                     */
  call prxnext(_expID, _start, _stop, varWithText, _pos, _len);

  do while (_pos > 0);
    found = substr(varWithText, _pos, _len);
/*    put found= _pos= _len=;*/
    output;
    call prxnext(_expID, _start, _stop, varWithText, _pos, _len);
  end;
run;

 

Code based on sample found here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002295965.htm 
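
The two patterns in this thread differ in what counts as "the match". With a capture group, the match includes the parentheses and the digits are read from capture buffer 1; with lookbehind and lookahead, the match is the digits alone, so SUBSTR on the reported position and length is enough. The following is a small sketch of that difference on the first match only; the variable names viaGroup, viaLook and num are chosen here for illustration.

```sas
data _null_;
  length viaGroup viaLook $20;
  text = 'abc (5) 23xx(9)';   /* shortened copy of the sample string above */

  /* Capture group: the whole match is "(5)", capture buffer 1 is "5" */
  idGroup = prxparse('/\((\d+)\)/');
  if prxmatch(idGroup, text) then viaGroup = prxposn(idGroup, 1, text);

  /* Lookarounds: the match itself is just the digits */
  idLook = prxparse('/(?<=\()\d+(?=\))/');
  call prxsubstr(idLook, text, pos, len);
  if pos > 0 then viaLook = substr(text, pos, len);

  /* Both results are character; INPUT converts to numeric if needed */
  num = input(viaLook, best.);
  put viaGroup= viaLook= num=;
run;
```

Both approaches should report 5 for this string. Note that found in the step above is a character variable, so an INPUT call like the one shown is needed if a numeric result is wanted.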

 
