buckeyefisher
Obsidian | Level 7

I want to extract digits from a string.

For example, a typical string is: PYRIDOSTIGMINE BROMIDE (NDA #020414)

I want to extract 020414.

I am having a hard time telling SAS (using the Perl regular expression functions) to extract what comes after '#' and stop before ')'.

Any ideas?

Thanks

1 ACCEPTED SOLUTION

Tim_SAS
Barite | Level 11

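data _null_;
  retain re;
  /* Compile the pattern once: '#', one or more digits (captured), then ')' */
  if _N_ = 1 then
    re = prxparse("/#(\d+)\)/");
  input str $ 1-40;
  /* On a match, copy capture buffer 1 (the digits) into num */
  if prxmatch(re, str) then do;
    num = prxposn(re, 1, str);
  end;
  put num=;
datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;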


10 Replies
Ksharp
Super User
data _null_ ;
a='PYRIDOSTIGMINE BROMIDE (NDA #020414)';
b=compress(a, ,'kd');
put a= b=;
run;

Xia Keshan

buckeyefisher
Obsidian | Level 7

Xia,

Thanks, but it does not give me the exact result I need. For example, I have the string "CLARITIN-D 24 HOUR (NDA #020470)" and want to extract 020470, but your solution returns 24020470.

So I want to use # to signal start of the number and ')' to signal end of the number. Any thoughts on this?

Kiran
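
The requirement stated above ('#' marks the start, ')' marks the end) can be sketched with plain string functions, without regular expressions. This is only a minimal sketch: the variable names start, stop, and number are made up for illustration, and it assumes the wanted number follows the first '#' in the string.

```sas
data _null_;
    length str $ 60 number $ 20;
    str = 'CLARITIN-D 24 HOUR (NDA #020470)';

    /* Position of the first '#'; FIND returns 0 when it is absent */
    start = find(str, '#');

    if start > 0 then do;
        /* First ')' located after the '#' */
        stop = find(str, ')', start + 1);

        /* Copy only the characters strictly between '#' and ')' */
        if stop > start + 1 then
            number = substr(str, start + 1, stop - start - 1);
    end;

    put number=;   /* number=020470 */
run;
```

Because the search for ')' starts after the '#', the "24" earlier in the string and any earlier parentheses are ignored.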

Reeza
Super User

Use the scan function if you have a consistent structure to the data.

sample=scan(word, 3, "()#");

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

buckeyefisher
Obsidian | Level 7

Thanks Reeza,

Although the data structure is almost standard, sometimes there are multiple opening and closing brackets.

So my best bet is to recognize the '#' symbol. Is there a way to do that?

Thanks,

Reeza
Super User

Use the scan function with # only as a delimiter and then again with the brackets or some combination thereof.
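
The two-step SCAN idea described above could look like the sketch below. The test string with an extra pair of parentheses is hypothetical, as are the variable names after_hash and number; the sketch assumes the string contains exactly one '#' and does not start with it.

```sas
data _null_;
    length str $ 60 after_hash $ 60 number $ 20;

    /* Hypothetical value with more than one pair of parentheses */
    str = 'DRUG (EXTENDED RELEASE) TABLETS (NDA #012345)';

    /* Step 1: split on '#' only and keep the second piece: '012345)' */
    after_hash = scan(str, 2, '#');

    /* Step 2: split that piece on ')' and keep the first part */
    number = scan(after_hash, 1, ')');

    put number=;   /* number=012345 */
run;
```

Splitting on '#' first means the extra parentheses earlier in the string never shift the word count, which is the problem with a single SCAN call that uses several delimiters at once.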

Reeza
Super User

data have;
    input str $ 1-40;
    datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

data want;
    set have;
    number=scan(str, 2, "#)");
run;

buckeyefisher
Obsidian | Level 7

Reeza,

Based on your previous suggestion, I made it work:

Number1=scan(str,5,"#(())");

Number2=scan(str,3,"#()");

I then concatenated the two columns.

It worked.

Tim_SAS
Barite | Level 11

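data _null_;
  retain re;
  /* Compile the pattern once: '#', one or more digits (captured), then ')' */
  if _N_ = 1 then
    re = prxparse("/#(\d+)\)/");
  input str $ 1-40;
  /* On a match, copy capture buffer 1 (the digits) into num */
  if prxmatch(re, str) then do;
    num = prxposn(re, 1, str);
  end;
  put num=;
datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;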

buckeyefisher
Obsidian | Level 7

Tim@SAS,

Your syntax works perfectly!

Thanks

Haikuo
Onyx | Level 15

If using PRXCHANGE, the code can be less verbose:

data want;
     input str $ 1-40;
     num=prxchange('s/.+#(\d+).+/$1/io',-1,str);
     cards;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

Haikuo
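
One thing to keep in mind with the PRXCHANGE approach: when the pattern does not match, PRXCHANGE returns the input string unchanged, so a row without "#digits)" would keep its whole text in num. A possible guard is sketched below; the second test value and the slightly stricter pattern (which also requires the closing ')') are assumptions added for illustration, not part of the original post.

```sas
data _null_;
    length str $ 40 num $ 40;

    /* Second value is a hypothetical row that has no number to extract */
    do str = 'CLARITIN-D 24 HOUR (NDA #020470)', 'NO NUMBER HERE';

        /* Substitute only when '#', digits, and ')' are actually present */
        if prxmatch('/#\d+\)/o', str) then
            num = prxchange('s/.*#(\d+)\).*/$1/o', 1, str);
        else
            num = ' ';   /* leave num blank instead of copying str */

        put str= num=;
    end;
run;
```

Whether a blank or the original text is the better fallback depends on how the unmatched rows should be handled downstream.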
