Hi, I want to extract part of a string that uses '_' as a delimiter. The string looks like this:
'abcd_ggg_fff_1234'
My question is: is there a single step to get 'ggg_fff' from that string, or do I have to do it in two steps?
Thanks!
Please try a Perl regular expression:
data have;
x='abcd_ggg_fff_1234';
y=prxchange('s/(\w+)(ggg_fff)(.\d+)/$2/',-1,x);
run;
Jag
Thanks, Jagadishkatam! But I have more strings with different middle parts:
'abcd_ggg_fff_1234'
'abcd_ttt_www_1234'
'qadc_hhh_lll_4321'
'dret_eee_1278'
I just want to do the general removal of prefix and suffix for all strings, not for a specific one.
Thank you!
Hi @leehsin,
Use CALL SCAN:
data have;
input str $30.;
cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;
data want;
set have;
call scan(str, 1, position, length,'_');
substr(str,1,length+1)=' ';
call scan(str, -1, position, length,'_');
substr(str,position-1)=' ';
drop position length;
run;
Hi novinosrin, thanks for the solution. So one step is not feasible. What if I want to keep both the original string and the new string?
Hi @leehsin, my second version gives you both. Please review it.
Or, to keep the original string and write the result to a new variable:
data have;
input str $30.;
cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;
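data want;
set have;
/* l = length of the first '_'-delimited word */
call scan(str, 1, p, l,'_');
/* position = start of the last '_'-delimited word */
call scan(str, -1, position, length,'_');
/* take what lies between the first and last underscores */
want=substr(str,l+2,position-(l+3));
drop p: l:;
run;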
Great! This solved my problem! Thank you so much!
Hello @leehsin, just to accept your challenge, here is a one-step solution. It's not any faster or better, but I loved your question.
data have;
input str $30.;
cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;
data want;
set have;
want=substr(str,index(str,'_')+1, findc(str,'_','B')- (index(str,'_')+1));
run;
Also, CALL SCAN is much faster than a regular expression for a simple problem like this. Reach for a regex only when your pattern really needs one.
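If you want to check the speed difference on your own system, a rough sketch like the one below may help. It is only an illustration: the `big` data set, the one-million-row count, and the general-purpose regex pattern are assumptions made for the test, not something from the posts above. `options fullstimer` makes the log report timing for each step so the two can be compared.

```sas
  /* Hypothetical benchmark data: one million copies of a sample string */
data big;
  length str $30;
  do i = 1 to 1e6;
    str = 'abcd_ggg_fff_1234';
    output;
  end;
  drop i;
run;

options fullstimer;

  /* Step 1: CALL SCAN plus SUBSTR */
data _null_;
  set big;
  call scan(str, 1, p, l, '_');
  call scan(str, -1, position, length, '_');
  want = substr(str, l+2, position-(l+3));
run;

  /* Step 2: regular expression (assumed general pattern) */
data _null_;
  set big;
  length want $30;
  want = prxchange('s/^[^_]*_(.*)_[^_]*$/$1/', 1, trim(str));
run;
```

Compare the real time and CPU time the log shows for the two `data _null_` steps. Results will vary with hardware and SAS version.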
@leehsin This linear approach is pretty quick too:
data have;
input str $30.;
cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;
data want;
set have;
length want $30;
do _n_=2 to countw(str,'_')-1;
want=catx('_',want,scan(str,_n_,'_'));
end;
run;
Hi, novinosrin,
Thank you for providing more solutions! These versatile ways will serve many kinds of scenarios. Thank you!
data have;
input text :$100.;
if countw(text,'_')=4 then y=prxchange('s/(\w+\_)(\w+\_\w+)(\_\d+)/$2/',-1,text);
else if countw(text,'_')=3 then y=prxchange('s/(\w+\_)(\w+)(\_\d+)/$2/',-1,text);
cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;
Jag
Jagadishkatam, yours is another good solution. It might need more code, though, if I want it to handle automatically whatever count countw returns rather than only 3 or 4.
Thank you!
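One possible way to avoid branching on the countw result is a single pattern that strips the first and last segments regardless of how many segments sit in between. The sketch below is a minimal illustration, not the method from the posts above. The pattern itself is an assumption about what "prefix" and "suffix" mean here: everything up to the first underscore and everything from the last underscore on. `trim(str)` removes the trailing blanks so that `$` can anchor at the end of the text.

```sas
data have;
  input str $30.;
  cards;
abcd_ggg_fff_1234
abcd_ttt_www_1234
qadc_hhh_lll_4321
dret_eee_1278
;

data want;
  set have;
  length want $30;
  /* first segment and its underscore, then the captured middle, */
  /* then the last underscore and the final segment              */
  want = prxchange('s/^[^_]*_(.*)_[^_]*$/$1/', 1, trim(str));
run;
```

For the four sample strings this should give ggg_fff, ttt_www, hhh_lll, and eee. A string with fewer than two underscores does not match the pattern, so prxchange returns it unchanged.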