I am trying to extract a substring from a field that has up to 2000 characters. I need to capture whatever is between "Destination(s):" and "." The following works, but is there a more elegant way to do this? I feel like I'm missing something obvious.
Case
When FIND(t1.Comment_ID,"Destination(s):") > 1 then STRIP(SUBSTR(t1.Comment_ID,FIND(t1.Comment_ID,"Destination(s):")+16,FIND(SUBSTR(t1.Comment_ID,FIND(t1.Comment_ID,"Destination(s):")+16,200),".")-1))
Else ""
End
Thanks,
Lisa
Another alternative is to use PRXPARSE. Here is an example.
data parse_data;
retain re;
text = 'this is some text. Destination(s): more text.';
if(_N_ = 1) then re = PRXPARSE("/(?<=Destination\(s\):)(.*)(?=\.)/io");
if(prxmatch(re, text) ) then new_text = strip(prxposn(re, 1, text) );
drop re;
run;
proc print;
run;
Another alternative is to use PRXPARSE. Here is an example.
data parse_data;
retain re;
text = 'this is some text. Destination(s): more text.';
if(_N_ = 1) then re = PRXPARSE("/(?<=Destination\(s\):)(.*)(?=\.)/io");
if(prxmatch(re, text) ) then new_text = strip(prxposn(re, 1, text) );
drop re;
run;
proc print;
run;
Thanks, this is along the lines of what I was, it just totally slipped my mind. Appreciate it!
I would like to mention that if you have data like this:
text = 'this is some text. Destination(s): more text.xxxxx.';
And you could get this :
It that what you are looking for?
If I were you ,I would prefer to use Tom 's code .
One simple thing to make it easier is to use a data step instead of SQL code.
If SQL you could store the location into another variable and use the CALCULATED keyword to reference it.
If you want everything up to a period then use SCAN().
....
find(t1.Comment_ID,"Destination(s):") as loc
, case when (calculated loc > 0)
then left(scan(substr(t1.Comment_ID,calculated loc+16),1,'.'))
else ' '
end as WANT length=2000
options parmcards=x;
filename x temp;
parmcards;
whatever Destination(s): and and . The following and works, but is there a mor Destination(s): or or . The following
whatever Destination(s): and and . The following and works, but is there a mor Destination(s): or or . The following
;
data want;
infile x length=len;
input x $varying2000. len;
n+1;
pid=prxparse('/Destination\(s\):.+?\./i');
s=1;e=length(x);
call prxnext(pid,s,e,x,p,l);
do while(p>0);
want=scan(substr(x,p,l),2,':.');
output;
call prxnext(pid,s,e,x,p,l);
end;
keep n want;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.