I have the following scenario:
If the string "text" has more than four spaces, I want to replace the fourth space with the string "[ESC]". Otherwise, nothing should be inserted. If replacing is complicated, inserting the string after the fourth space is also fine. Thank you for your help.
Example: "I am happy to[ESC]be part of this group"
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
data have1;
set have;
fourth_word=scan(text,4,' ');
location_of_fourth_word=findw(text,cats(' ',fourth_word));
length_of_fourth_word=length(fourth_word);
if not missing(fourth_word) then
replace_text=cats(substr(text,1,location_of_fourth_word+length_of_fourth_word-1),'[ESC]',
substr(text,location_of_fourth_word+length_of_Fourth_word));
else replace_text=text;
run;
Thank you.
Can you show the code you have tried?
This blog post by Leonid Batkhan might help:
https://blogs.sas.com/content/sgf/2019/06/26/finding-n-th-instance-of-a-substring-within-a-string/
Another approach would be to use the CALL SCAN routine, which can return the position of the 5th word.
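A minimal sketch of the CALL SCAN idea, assuming words are separated by blanks and that "the fourth space" means the blank just before the fifth word. The data set name want_scan and the variables pos, len, and newtext are illustrative names, not from the original posts.

```sas
data want_scan;
  length text $60 newtext $64;
  do text = "I am happy to be part of this group",
            "This SAS Group is Amazing",
            "Thank you very much",
            " Thank you";
    /* CALL SCAN returns the start position and length of the 5th
       blank-delimited word; pos is 0 when there are fewer than 5 words. */
    call scan(text, 5, pos, len, ' ');
    if pos > 0 then
      /* Keep the text before the blank that precedes the 5th word,
         put [ESC] in place of that blank, then append the rest. */
      newtext = substr(text, 1, pos - 2) || '[ESC]' || substr(text, pos);
    else newtext = text;  /* fewer than 5 words: leave unchanged */
    output;
  end;
  drop pos len;
run;
```

If several blanks can separate the fourth and fifth words, only the last one is replaced here; applying COMPBL to the text first is one way to normalize that.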
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
data want;
set have;
want=prxchange('s/(^\S+\s+\S+\s+\S+\s+\S+)\s+/\1[ESC]/',1,left(text));
run;
Thank you for your time, @Ksharp. It's awesome that this can be done in one step 😮. What do '\S+' and '\s+' signify in the prxchange call?
Thank you @PaigeMiller @ErikLund_Jensen @mkeintz @Ksharp. All of you are amazing. Unfortunately, I have to choose only one as the answer 😑.
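For reference, in Perl regular expressions \S matches any non-whitespace character, \s matches any whitespace character, and + means "one or more", so \S+\s+ reads as "a word followed by the gap after it". A small sketch to illustrate (the variable names text and rest are made up for this example):

```sas
data _null_;
  length text rest $40;
  text = "This SAS Group is Amazing";
  /* \S+ : one or more non-whitespace characters, i.e. one word     */
  /* \s+ : one or more whitespace characters, i.e. the gap after it */
  /* Remove the first word and the whitespace that follows it.      */
  rest = prxchange('s/^\S+\s+//', 1, text);
  put rest=;
run;
```

Repeating \S+\s+ four times, as in the pattern above, therefore walks across the first four words, and the parentheses capture them so they can be written back in front of [ESC].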
Hi @SASuserlot
This uses a prxchange similar to the one provided by @Ksharp, but with an added check on the number of words in the string. The check avoids inserting [ESC] at the end of the string when there are exactly 4 words.
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = "Thank you very much";
output;
text = "Thanks for everything";
output;
text = " Thank you";
output;
run;
data want;
length text $200;
set have;
if countw(text,' ') > 4 then
text = prxchange('s/(\S+\s+\S+\s+\S+\s+\S+)(\s*)(.*)/$1[ESC]$3/',1,catt(text));
run;
data have;
text = " I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
data want (drop=_:);
set have;
text=left(compbl(text));
length newtext $40;
if countw(text) > 4 then do;
_four_word_length=length(catx(' ',scan(text,1),scan(text,2),scan(text,3),scan(text,4)))+1;
newtext=cats(substr(text,1,_four_word_length),'[ESC]',substr(text,_four_word_length+1));
end;
run;