Solved
Contributor
Posts: 63

# Extract digits

I want to extract digits from a string.

For example, a typical string is: PYRIDOSTIGMINE BROMIDE (NDA #020414)

I want to extract 020414.

I am having a hard time telling SAS (with the Perl regular expression functions) to start extracting after '#' and stop before ')'.

Any ideas?

Thanks

Accepted Solutions
Solution
‎06-11-2014 12:57 PM
Super Contributor
Posts: 394

## Re: Extract digits

```
data _null_;
    retain re;
    /* Compile the pattern once: '#', one or more digits (captured), then ')' */
    if _N_ = 1 then
        re = prxparse("/#(\d+)\)/");
    input str $ 1-40;
    if prxmatch(re, str) then do;
        num = prxposn(re, 1, str);  /* text of capture group 1 */
    end;
    put num=;
    datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;
```

All Replies
Super User
Posts: 10,787

## Re: Extract digits

```
data _null_ ;
a='PYRIDOSTIGMINE BROMIDE (NDA #020414)';
b=compress(a, ,'kd');  /* 'k' keeps the listed characters, 'd' adds digits to the list */
put a= b=;
run;
```

Xia Keshan

Contributor
Posts: 63

## Re: Extract digits

Xia,

Thanks, but it does not give me the exact result. For example, for the string "CLARITIN-D 24 HOUR (NDA #020470)" I want to extract 020470, but your solution returns 24020470.

So I want to use '#' to signal the start of the number and ')' to signal the end of the number. Any thoughts on this?

Kiran

Super User
Posts: 23,776

## Re: Extract digits

Use the scan function if you have a consistent structure to the data.

sample=scan(word, 3, "()#");

SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition

Contributor
Posts: 63

## Re: Extract digits

Thanks Reeza,

Although the data structure is almost standard, sometimes there are multiple opening and closing brackets.

So my best bet is to recognize the '#' symbol. Is there a way to do that?

Thanks,
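One way to key on '#' directly is to locate it with INDEX and then cut the text up to the next ')'. The sketch below is only an illustration: the input string is made up to mimic a value with more than one pair of brackets, and the variable names `start`, `stop`, and `num` are hypothetical.

```sas
data _null_;
    length str $60 num $20;
    /* Made-up example value with two pairs of brackets */
    str = 'DRUG (EXTENDED RELEASE) (NDA #020470)';
    start = index(str, '#');                    /* position of '#', 0 if absent */
    if start > 0 then do;
        stop = index(substr(str, start), ')');  /* offset of the next ')' counted from '#' */
        /* stop > 2 means at least one character sits between '#' and ')' */
        if stop > 2 then num = substr(str, start + 1, stop - 2);
    end;
    put num=;
run;
```

Because the search for ')' starts at the '#', brackets that appear earlier in the string do not affect the result.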

Super User
Posts: 23,776

## Re: Extract digits

Use the scan function with # only as a delimiter and then again with the brackets or some combination thereof.
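A minimal sketch of that two-step idea, assuming the string contains a single '#' that is not the first character. The intermediate variable `tail` is a name chosen here for illustration.

```sas
data _null_;
    length str tail $40 num $20;
    str  = 'CLARITIN-D 24 HOUR (NDA #020470)';
    tail = scan(str, 2, '#');   /* everything after the '#': 020470) */
    num  = scan(tail, 1, ')');  /* everything before the next ')': 020470 */
    put num=;
run;
```

Since the first SCAN splits on '#' alone, any extra brackets before the '#' end up in the first token and are ignored.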

Super User
Posts: 23,776

## Re: Extract digits

```
data have;
    input str $ 1-40;
    datalines;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;

data want;
    set have;
    /* Split on '#' and ')' only: the second token is the number */
    number=scan(str, 2, "#)");
run;
```

Contributor
Posts: 63

## Re: Extract digits

Reeza -

Number1=scan(str,5,"#(())");

Number2=scan(str,3,"#()");

and then concatenated the two columns.

It worked.

Contributor
Posts: 63

Time@SAS

Thanks

Posts: 3,167

## Re: Extract digits

If using PRXCHANGE, the code can be less verbose:

```
data want;
    input str $ 1-40;
    /* Replace the whole string with capture group 1 (the digits after '#') */
    num=prxchange('s/.+#(\d+).+/$1/io',-1,str);
    cards;
CLARITIN-D 24 HOUR (NDA #020470)
PYRIDOSTIGMINE BROMIDE (NDA #020414)
;
run;
```

Haikuo
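One thing to watch with the PRXCHANGE approach: when the pattern does not match, PRXCHANGE returns the source string unchanged, so a row without '#' followed by digits would end up with the whole string in `num`. A hedged sketch of one way to guard against that follows. The second data line is invented to show the no-match case, and `length num $20` is an assumption about how long the numbers can get.

```sas
data want;
    length num $20;
    input str $ 1-40;
    /* Only rewrite rows that contain '#' followed by digits; otherwise num stays blank */
    if prxmatch('/#\d+/', str) then
        num = prxchange('s/.*#(\d+).*/$1/', 1, str);
    put str= num=;
    datalines;
CLARITIN-D 24 HOUR (NDA #020470)
NO APPLICATION NUMBER HERE
;
run;
```

The PRXMATCH check costs a second regex pass per row, which is usually negligible, and it keeps non-matching rows from silently copying the full text into the result.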

🔒 This topic is solved and locked.