daradanye
Obsidian | Level 7

Hi,

 

I need to use a SAS regular expression to extract a substring from some messy strings. I've written the regular expression pattern, but I'm not quite sure how to apply it in SAS.

 

The examples are as follows:

 

Have                        Want
a2191625zex-10_14.htm       10_14
ade4378dexhibit10_2.txt     10_2

 

The regular expression I came up with is (ex\D*)(\d+\D+\d*)

I am using two sets of brackets here to form capture groups. In Python or other languages, I can get what I want by referring to group(2). Is there a similar function in SAS?

 

I would be very grateful if someone can help out here.

 

Thanks,

Dara

3 REPLIES
PGStats
Opal | Level 21

Try prxChange:

 

want = prxChange("s/.*ex\D*(\d+\D\d*).*/\1/", 1, have);
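
If you would rather keep the original two-group pattern, the closest SAS counterpart to Python's group(2) is probably PRXPOSN, which returns the text of a capture buffer after a successful PRXMATCH. Below is a minimal sketch of that approach. The data set name (extract) and the variable name rx are placeholders chosen for illustration; they are not from the original post.

```sas
data extract;
  input have : $40.;
  length want $40;
  retain rx;
  drop rx;
  /* Compile the pattern once, on the first observation */
  if _n_ = 1 then rx = prxparse('/(ex\D*)(\d+\D+\d*)/');
  /* PRXMATCH returns the match position (0 if no match);        */
  /* PRXPOSN then returns the text of capture buffer 2.          */
  if prxmatch(rx, have) then want = prxposn(rx, 2, have);
datalines;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
run;

proc print data=extract; run;
```

On the two sample strings this should give 10_14 and 10_2. Rows that do not match the pattern are left with a blank want.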
PG
Ksharp
Super User
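data x;
  input have : $40. ;
  /* Run of digits/underscores directly before the file extension; */
  /* the lookahead (?=...) is checked but not included in the match */
  pid=prxparse('/[\d_]+(?=\.[a-z]+)/io');
  if pid>0 then do;
    /* p = start position, l = length of the match (both 0 if none) */
    call prxsubstr(pid,have,p,l);
    if p>0 then want=substr(have,p,l);
  end;
cards;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
proc print;run;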
Ksharp
Super User

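Or use SCAN():

data x;
  input have : $40. ;
  /* Inner SCAN drops the extension. In the outer SCAN, 'd' adds digits */
  /* to the '_' list and 'k' inverts it, so every character other than  */
  /* a digit or underscore is a delimiter; -1 takes the last word.      */
  want=scan(scan(have,1,'.'),-1,'_','kd');
cards;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
proc print;run;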