buckeyefisher
Obsidian | Level 7

I want to extract digits from a string

For example a typical string is given as  =  PYRIDOSTIGMINE BROMIDE (NDA #020414)

I want to extract 020414.

I am having a hard time telling SAS (with the Perl regular expression functions) to start extracting after '#' and stop before ')'.

Any ideas?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Tim_SAS
Barite | Level 11

data _null_;
retain re;
if _N_ = 1 then
  re = prxparse("/#(\d+)\)/");
input str $ 1-40;
if prxmatch(re, str) then do;
  num = prxposn(re, 1, str);
end;
put num=;
datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;


10 REPLIES 10
Ksharp
Super User
data _null_ ;
a='PYRIDOSTIGMINE BROMIDE (NDA #020414)';
b=compress(a, ,'kd');
put a= b=;
run;

Xia Keshan

buckeyefisher
Obsidian | Level 7

Xia,

Thanks, but it does not give me the exact result. For example, I have the string "CLARITIN-D 24 HOUR (NDA #020470)" and I want to extract 020470, but your solution returns 24020470.

So I want to use # to signal start of the number and ')' to signal end of the number. Any thoughts on this?

Kiran

Reeza
Super User

Use the scan function if you have a consistent structure to the data.

sample=scan(word, 3, "()#");

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

buckeyefisher
Obsidian | Level 7

Thanks Reeza,

Although the data structure is almost standard, sometimes there are multiple open and close brackets.

So my best bet is to key on the '#' symbol. Is there a way to do that?

Thanks,

Reeza
Super User

Use the scan function with # only as a delimiter and then again with the brackets or some combination thereof.
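
The two-step idea above could be sketched as follows. This is only an illustration, not tested against your real data: the variable names after_hash and number are made up for the example, and the second data line is a hypothetical case with extra brackets before the '#'. It assumes each string contains exactly one '#', and that the number is immediately followed by ')'.

```sas
data want;
    input str $ 1-50;
    /* Step 1: split on '#' only; the 2nd token is everything after the '#' */
    after_hash = scan(str, 2, '#');
    /* Step 2: split that remainder on ')'; the 1st token is the number */
    number = scan(after_hash, 1, ')');
    put str= number=;
    datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE (BROMIDE) (NDA #020414)
;
run;
```

Because the first SCAN ignores brackets entirely, the number of brackets before the '#' should not matter. If a string has no '#', the first SCAN returns a blank, so number ends up blank as well.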

Reeza
Super User

data have;
input str $ 1-40;
datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

data want;
    set have;
    number=scan(str, 2, "#)");
run;

buckeyefisher
Obsidian | Level 7

Reeza -

Based on your previous suggestion, I made it work:

Number1=scan(str,5,"#(())");

Number2=scan(str,3,"#()");

Then I concatenated the two columns.

It worked.

Tim_SAS
Barite | Level 11

data _null_;
retain re;
if _N_ = 1 then
  re = prxparse("/#(\d+)\)/");
input str $ 1-40;
if prxmatch(re, str) then do;
  num = prxposn(re, 1, str);
end;
put num=;
datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

buckeyefisher
Obsidian | Level 7

Tim@SAS,

Your syntax works perfectly!

Thanks

Haikuo
Onyx | Level 15

If using PRXCHANGE, the code can be less verbose:

data want;
     input str $ 1-40;
     num=prxchange('s/.+#(\d+).+/$1/io',-1,str);
     cards;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

Haikuo
