I have the following scenario:
I need to check whether the string "text" contains more than 4 spaces and, if it does, replace the fourth space with the string "[ESC]". Otherwise, nothing should be inserted. If replacing is complicated, I am okay with inserting the string after the fourth space. Thank you for your help.
Example: "I am happy to[ESC]be part of this group"
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
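data have1;
  set have;
  /* Fourth blank-delimited word; missing when the text has fewer than four words */
  fourth_word=scan(text,4,' ');
  /* CATS strips the leading blank, so this searches for the word itself */
  location_of_fourth_word=findw(text,cats(' ',fourth_word));
  length_of_fourth_word=length(fourth_word);
  if not missing(fourth_word) then
    /* CATS also strips the blank that followed the fourth word, so [ESC] replaces it */
    replace_text=cats(substr(text,1,location_of_fourth_word+length_of_fourth_word-1),'[ESC]',
      substr(text,location_of_fourth_word+length_of_fourth_word));
  else replace_text=text;
run;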
Thank you.
Can you show the code you have tried?
This blog post by Leonid Batkhan might help:
https://blogs.sas.com/content/sgf/2019/06/26/finding-n-th-instance-of-a-substring-within-a-string/
Another approach would be to use the CALL SCAN routine, which can return the position of the 5th word.
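A minimal sketch of the CALL SCAN idea, assuming the same HAVE data set used elsewhere in this thread. The names want_callscan, newtext, pos and len are made up for illustration. CALL SCAN returns the start position of the 5th blank-delimited word (0 when there are fewer than five words), and CATS removes the blank(s) in front of that word so that [ESC] takes their place.

```sas
data want_callscan;
  set have;
  length newtext $60;   /* assumed to be long enough for text plus '[ESC]' */
  newtext = text;       /* default: leave the text unchanged */
  pos = 0;
  len = 0;
  /* position and length of the 5th blank-delimited word;
     pos is 0 when the text has fewer than five words */
  call scan(text, 5, pos, len, ' ');
  if pos > 1 then
    /* CATS strips the trailing blank(s) of the first piece,
       so [ESC] replaces the gap before the 5th word */
    newtext = cats(substr(text, 1, pos - 1), '[ESC]', substr(text, pos));
  drop pos len;
run;
```

Because the test is on the 5th word rather than the 4th, a string with exactly four words should be left unchanged. Note that CATS also strips any leading blanks from the text.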
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
data want;
set have;
want=prxchange('s/(^\S+\s+\S+\s+\S+\s+\S+)\s+/\1[ESC]/',1,left(text));
run;
Thank you for your time, @Ksharp. It's awesome that this can be done in one step 😮. What do the '\S+' and '\s+' signify in the prxchange?
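As a rough sketch of an answer to the question above: in Perl regular expressions, \s matches one whitespace character, \S matches one non-whitespace character, and + means "one or more". So \S+ is a word and \s+ is the gap between two words. In the pattern above, ^ anchors the match at the start of the string, the parentheses capture the first four words, and \1 puts that captured text back in front of [ESC]. The small step below only illustrates the two tokens; the variable names words and gaps are made up.

```sas
data _null_;
  text = "I am happy to be part of this group";
  /* replace every run of non-whitespace characters (each word) with W */
  words = prxchange('s/\S+/W/', -1, text);
  /* replace every run of whitespace characters (each gap) with an underscore */
  gaps = prxchange('s/\s+/_/', -1, strip(text));
  put words= / gaps=;
run;
```

The -1 as the second argument tells PRXCHANGE to replace every match, whereas the 1 used in the code above replaces only the first match.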
Thank you @PaigeMiller @ErikLund_Jensen @mkeintz @Ksharp. All of you are amazing. Unfortunately, I have to choose one as the answer 😑.
Hi @SASuserlot
This uses a prxchange similar to the one provided by @Ksharp, but with an added check on the number of words in the string. The check avoids inserting [ESC] at the end of the string when there are exactly 4 words.
data have;
text = "I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = "Thank you very much";
output;
text = "Thanks for everything";
output;
text = " Thank you";
output;
run;
data want;
length text $200;
set have;
if countw(text,' ') > 4 then
text = prxchange('s/(\S+\s+\S+\s+\S+\s+\S+)(\s*)(.*)/$1[ESC]$3/',1,catt(text));
run;
data have;
text = " I am happy to be part of this group";
output;
text = "This SAS Group is Amazing";
output;
text = " Thank you";
output;
run;
data want (drop=_:);
set have;
text=left(compbl(text));
length newtext $40;
if countw(text) > 4 then do;
_four_word_length=length(catx(' ',scan(text,1),scan(text,2),scan(text,3),scan(text,4)))+1;
newtext=cats(substr(text,1,_four_word_length),'[ESC]',substr(text,_four_word_length+1));
end;
run;