I have a string
example 1: abcdefgs:456:fdkweu/:START:41A:45583993:START:60A:6483829992:STOP:59:hagdsu2381ygu:73:
example 2:
hsdgsdgcj:64:54:adcg6edd:START:71A:274765765:START:84A:26357457454:STOP:59:2237fdguef
I want to extract any string between the 1st START and STOP strings i.e. the output should look like
Output 1:
41A:45583993:START:60A:6483829992
Output 2:
71A:274765765:START:84A:26357457454
How do I resolve this? I tried using SCAN function but with no luck.
You can use the INDEX function to find the first position of one string within another.
data have; input string :$100.; datalines; abcdefgs:456:fdkweu/:START:41A:45583993:START:60A:6483829992:STOP:59:hagdsu2381ygu:73: hsdgsdgcj:64:54:adcg6edd:START:71A:274765765:START:84A:26357457454:STOP:59:2237fdguef ; data want; set have; startpos=index(string,'START:'); endpos =index(string,':STOP:'); length newstring $ 100; newstring = substr(string,startpos+6,endpos-(startpos+7)); run;
I left the variables startpos and endpos as examples. The startpos has to be shifted because is the first position of the "START:" so you need to add 6 characters, the length of "START:" to get the first character following the first occurence. The endpos needs to be adjusted as it is the original position in the string and you need to remove the characters that come before the START:.
The INDEX function will return 0 for the position if the string is not found. Best practice would be to do the SUBSTR only when both Statpos and Endpos are greater than 0.
You don't say what to do if you have a Start without a Stop. That should be considered.
You can use the INDEX function to find the first position of one string within another.
data have; input string :$100.; datalines; abcdefgs:456:fdkweu/:START:41A:45583993:START:60A:6483829992:STOP:59:hagdsu2381ygu:73: hsdgsdgcj:64:54:adcg6edd:START:71A:274765765:START:84A:26357457454:STOP:59:2237fdguef ; data want; set have; startpos=index(string,'START:'); endpos =index(string,':STOP:'); length newstring $ 100; newstring = substr(string,startpos+6,endpos-(startpos+7)); run;
I left the variables startpos and endpos as examples. The startpos has to be shifted because is the first position of the "START:" so you need to add 6 characters, the length of "START:" to get the first character following the first occurence. The endpos needs to be adjusted as it is the original position in the string and you need to remove the characters that come before the START:.
The INDEX function will return 0 for the position if the string is not found. Best practice would be to do the SUBSTR only when both Statpos and Endpos are greater than 0.
You don't say what to do if you have a Start without a Stop. That should be considered.
data have;
length eg $500;
eg='abcdefgs:456:fdkweu/:START:41A:45583993:START:60A:6483829992:STOP:59:hagdsu2381ygu:73';
output;
eg='hsdgsdgcj:64:54:adcg6edd:START:71A:274765765:START:84A:26357457454:STOP:59:2237fdguef';
output;
run;
data want;
set have;
length want $200;
n=index(eg,'START')+6;
want=substr(eg,n,index(eg,'STOP')-n-1);
drop n;
run;
data have;
input string :$100.;
datalines;
abcdefgs:456:fdkweu/:START:41A:45583993:START:60A:6483829992:STOP:59:hagdsu2381ygu:73:
hsdgsdgcj:64:54:adcg6edd:START:71A:274765765:START:84A:26357457454:STOP:59:2237fdguef
;
data want;
set have;
pid=prxparse('/start.+stop/i');
if prxmatch(pid,string) then do;
call prxsubstr(pid,string,p,l);
want=substr(string,p+6,l-11);
end;
drop pid p l;
run;
Thank you Members, All the solutions worked
Or you simply loop through the string (the method for using SCAN):
data want;
set have;
length newstring $100;
_flag = 0;
do _i = 1 to countw(string,":");
_word = scan(string,_i,":");
if _word = "START" then _flag = 1;
else if _word = "STOP" then _flag = 0;
else if _flag then newstring = catx(":",newstring,_word);
end;
drop _:;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.