I want to read the last sentence in a long text of length 350. Sentences can be separated by punctuation marks: !, ?, or a period. There should be no leading blanks in the result. I don't know how to write code for this.
You can use either SCAN() or regular expression matching. Here is how to do the latter:
data have;
infile datalines truncover;
input text $char350.;
datalines;
i want to read the last sentence in a long text having length 350 .it can be separated by marks, or ! or ? or period. I don't want to use No leading blanks be there. I don't know how to develop a code for this .
;
data lastSentence;
    set have;
    length sentence $350;
    /* compile the pattern once; the sum statement retains prx across rows */
    if not prx then prx + prxParse("/[^.?!]*[.?!]/");
    start = 1;
    stop = length(text);
    /* step through every match; the last one found is the last sentence */
    call prxNext(prx, start, stop, text, pos, len);
    do while(pos>0);
        sentence = substr(text, pos, len);
        call prxNext(prx, start, stop, text, pos, len);
    end;
    keep text sentence;
run;
proc print; run;
If you only want the extracted sentence in the output, use
keep sentence;
instead of
keep text sentence;
Same, but slightly improved and better tested:
data have;
infile datalines truncover;
input text $char350.;
datalines;
i want to read the last sentence in a long text having length 350 .it can be separated by marks, or ! or ? or period. I don't want to use No leading blanks be there. I don't know how to develop a code for this .
First sentence. Last sentence with missing punctuation
First sentence!!! Last sentence, with emphatic punctuation ?!
First sentence is also the last.
;
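data lastSentence;
    set have;
    length sentence $350;
    /* a sentence: one or more non-delimiters, then a run of delimiters or the end of the text */
    if not prx then prx + prxParse("/[^.?!]+([.?!]+|$)/");
    start = 1;
    stop = length(text);
    call prxNext(prx, start, stop, text, pos, len);
    do while(pos>0);
        /* left() removes the blank that follows the previous sentence */
        sentence = left(substr(text, pos, len));
        call prxNext(prx, start, stop, text, pos, len);
    end;
    keep sentence;
run;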
proc print; run;
The INDEXC function searches forward for the first occurrence of any character from a given list of characters.
As I couldn't find a similar function that searches backwards, I would use a loop to find the end of the next-to-last sentence:
length paragraph $350; /* contains your long string */
do i = length(trim(paragraph)) to 1 by -1;
    if substr(paragraph,i,1) in ('.' , '!' , '?') /* add any other delimiter characters you want */
    then leave; /* exit the loop with i pointing to the end of the previous sentence */
end;
last_sentence = substr(paragraph, i+1, length(trim(paragraph)) - i );
You can search from right to left with findc().
Thank you @Patrick, I skipped over the relevant modifier when reading the FINDC documentation.
This function shortens the code to:
i = findc(paragraph , '.?!' , 'B'); /* the B modifier searches backwards */
sentence = substr(paragraph, i+1, length(trim(paragraph)) - i );
put sentence=; /* write the sentence to the log */
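For reference, here is a minimal sketch of the FINDC approach as a complete DATA step. The dataset name demo, the helper variable n, and the two sample lines are made up for illustration. It assumes you want to skip the sentence's own closing punctuation, so it searches backwards in the text with its final character cut off, and it applies LEFT() to drop the leading blank. Note that the modifier is passed as a quoted string.

```sas
data demo;
    length paragraph sentence $350;
    infile datalines truncover;
    input paragraph $char350.;
    n = length(paragraph);   /* position of the last non-blank character */
    /* search right to left, ignoring the final character of the text */
    if n > 1 then i = findc(substr(paragraph, 1, n - 1), '.?!', 'B');
    else i = 0;
    /* i = 0 means no earlier delimiter: the whole text is the last sentence */
    sentence = left(substr(paragraph, i + 1));
    put sentence=;           /* write the sentence to the log */
    datalines;
First sentence. Second sentence? Last sentence!
Only one sentence.
;
```

A text ending in several punctuation characters (such as ?!) would still be cut between them, so this sketch only covers a single closing character.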
Can't you use scan()?
want = scan(paragraph ,-1, '.?!' );
@Ksharp, you would have to use something like
sentence = coalescec(scan(text ,-1, '.?!' ), scan(text ,-2, '.?!' ));
but you would still miss the last punctuation character.
Since each of the functions INDEXC, FINDC, and SCAN drops the delimiter character (. ? !),
a slight change to the loop code may do the job:
do i = length(trim(paragraph)) - 1 to 1 by -1;
    if substr(paragraph,i,1) in ('.' , '!' , '?') /* add any other delimiter characters you want */
    then leave; /* exit the loop with i pointing to the end of the previous sentence */
end;
last_sentence = substr(paragraph, i+1, length(trim(paragraph)) - i );
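As a hedged sketch, the loop can be wrapped in a complete DATA step like the one below. The dataset name demo_loop and the helper variables n and e are hypothetical; the sample lines are borrowed from the test data earlier in the thread. Two things are added here as assumptions about what is wanted: the closing run of punctuation (for example ?!) is stepped over before the backward search starts, and LEFT() removes the leading blank.

```sas
data demo_loop;
    length paragraph last_sentence $350;
    infile datalines truncover;
    input paragraph $char350.;
    n = length(paragraph);   /* LENGTH already ignores trailing blanks */
    /* step 1: move e back over the closing run of punctuation */
    e = n;
    do while (e > 1 and substr(paragraph, e, 1) in ('.', '!', '?'));
        e = e - 1;
    end;
    /* step 2: from there, look back for the previous sentence's delimiter */
    do i = e to 1 by -1;
        if substr(paragraph, i, 1) in ('.', '!', '?') then leave;
    end;
    /* i is 0 here when the text contains no earlier delimiter */
    last_sentence = left(substr(paragraph, i + 1));
    put last_sentence=;
    datalines;
First sentence. Last sentence with missing punctuation
First sentence!!! Last sentence, with emphatic punctuation ?!
First sentence is also the last.
;
```

The two-argument form of SUBSTR is used so that the result simply runs to the end of the variable; the trailing blanks it picks up do not matter for a fixed-length character variable.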