Posted 01-23-2020 05:38 PM
(1502 views)
Hi,
I need to use a SAS regular expression to extract a substring from some messy strings. I have written the regular expression pattern, but I am not quite sure how to apply it in SAS.
The examples are as follows:
Have | Want |
a2191625zex-10_14.htm | 10_14 |
ade4378dexhibit10_2.txt | 10_2 |
The regular expression I came up with is (ex\D*)(\d+\D+\d*)
I am using two sets of parentheses here to create capture groups. In Python or other languages, I can retrieve what I want with group(2). Is there a similar function in SAS?
I would be very grateful if someone can help out here.
Thanks,
Dara
3 REPLIES
Try prxChange:
want = prxChange("s/.*ex\D*(\d+\D\d*).*/\1/", 1, have);
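If you would rather keep the two-group pattern from the question and pick out the second group, the closest analog to Python's group(2) is probably prxposn, called after a successful prxmatch. The following is only a minimal sketch under some assumptions: the data set name `want`, the variable name `pid`, and the $40 length are illustrative choices, not anything the original post specifies.

```sas
data want;
    input have : $40.;
    length want $40;
    retain pid;

    /* Compile the pattern from the question once, on the first row */
    if _n_ = 1 then pid = prxparse('/(ex\D*)(\d+\D+\d*)/');

    /* prxmatch must succeed before prxposn can return a capture buffer */
    if prxmatch(pid, have) then want = prxposn(pid, 2, have);

    datalines;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
run;

proc print data=want; run;
```

For the two sample strings this should give 10_14 and 10_2. Note that \D+ in the second group is greedy, so a name with no second number (for example ex10.htm) would capture the extension as well.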
PG
data x;
input have : $40. ;
pid=prxparse('/[\d_]+(?=\.[a-z]+)/io');
if pid>0 then do;
call prxsubstr(pid,have,p,l);
want=substr(have,p,l);
end;
cards;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
proc print;run;
Or use scan():
data x;
input have : $40. ;
want=scan(scan(have,1,'.'),-1,'_','kd');
cards;
a2191625zex-10_14.htm
ade4378dexhibit10_2.txt
;
proc print;run;