afs
Calcite | Level 5

I want to read the last sentence of a long text variable of length 350. Sentences can end with punctuation marks: !, ?, or a period. I don't want to use ... There should be no leading blanks in the result. I don't know how to write code for this.

 

 

1 ACCEPTED SOLUTION

Shmuel
Garnet | Level 18

Because INDEXC, FINDC, and SCAN all stop at the delimiter character (. ? !) and therefore drop the punctuation that closes the last sentence, a slight change to the loop code should do the job:

do i = length(paragraph) - 1 to 1 by -1;   /* start one character before the end to skip the closing punctuation */
      if substr(paragraph,i,1) in ('.' , '!' , '?')      /* you may add any other characters you want */
         then leave;             /* exit the loop with i pointing to the end of the previous sentence */
end;

/* LENGTH already ignores trailing blanks, so TRIM is not needed; LEFT removes the leading blanks */
last_sentence = left(substr(paragraph, i+1, length(paragraph) - i));
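
For anyone who wants to try the loop end to end, here is a minimal sketch that wraps it in a DATA step. The data set name `want`, the helper variable `n`, and the two sample rows are made up for illustration; only the loop logic comes from the post above.

```sas
data want;
    infile datalines truncover;
    input paragraph $char350.;
    length last_sentence $350;

    n = length(paragraph);    /* position of the last non-blank character */

    /* walk backwards, starting one character before the end so that
       the closing punctuation of the last sentence is skipped */
    do i = n - 1 to 1 by -1;
        if substr(paragraph, i, 1) in ('.', '!', '?') then leave;
    end;

    /* i is 0 when no earlier punctuation exists, so the whole text is kept */
    last_sentence = left(substr(paragraph, i + 1, n - i));

    keep paragraph last_sentence;
    datalines;
First sentence. Second sentence! Is this the last one?
Only one sentence here.
;
run;

proc print data=want; run;
```

One limitation to keep in mind: a text that ends in doubled punctuation such as `?!` stops the loop at the `?`, so only the final `!` is returned.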


11 REPLIES
PGStats
Opal | Level 21

You can use either SCAN() or regular expression matching. Here is how to do the latter:

 

data have;
infile datalines truncover;
input text $char350.;
datalines;
i want to read the last sentence in a long text having length 350 .it can be separated by marks, or ! or ? or period. I don't want to use  No leading blanks be there.   I don't know how to develop a code for this .
;

data lastSentence;
set have;
length sentence $350;
if not prx then prx + prxParse("/[^.?!]*[.?!]/");
start = 1;
stop=length(text);
call prxNext(prx, start, stop, text, pos, len);
do while(pos>0);
    sentence = substr(text, pos, len);
    call prxNext(prx, start, stop, text, pos, len);
    end;
keep text sentence;
run;

proc print; run;

 

PG
afs
Calcite | Level 5
It worked, thanks. But the result shows both the text and the sentence. Is it possible to see just the sentence?
PGStats
Opal | Level 21

keep sentence;

 

instead of 

 

keep text sentence;

PG
PGStats
Opal | Level 21

Same, but slightly improved and better tested:

 

data have;
infile datalines truncover;
input text $char350.;
datalines;
i want to read the last sentence in a long text having length 350 .it can be separated by marks, or ! or ? or period. I don't want to use  No leading blanks be there.   I don't know how to develop a code for this .
First sentence. Last sentence with missing punctuation
First sentence!!! Last sentence, with emphatic punctuation ?!

First sentence is also the last.
;

data lastSentence;
set have;
length sentence $350;
if not prx then prx + prxParse("/[^.?!]+([.?!]+|$)/");
start = 1;
stop = length(text);
call prxNext(prx, start, stop, text, pos, len);
do while(pos>0);
    sentence = left(substr(text, pos, len));
    call prxNext(prx, start, stop, text, pos, len);
    end;
keep sentence;
run;

proc print; run;
PG
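
As an alternative to stepping through every sentence with PRXNEXT, a single pattern anchored at the end of the value can pick out the last sentence in one match. This is only a sketch under stated assumptions: the pattern, the data set names `demo` and `lastSentence2`, and the sample rows are illustrative and not taken from the posts above.

```sas
data demo;
    infile datalines truncover;
    input text $char350.;
    datalines;
First sentence. Last sentence with missing punctuation
First sentence!!! Last sentence, with emphatic punctuation ?!
First sentence is also the last.
;

data lastSentence2;
    set demo;
    length sentence $350;
    retain prx;
    /* compile once: a run of non-terminators, optional closing punctuation,
       then nothing but whitespace up to the end of the value */
    if _n_ = 1 then prx = prxparse('/([^.?!]+[.?!]*)\s*$/');
    /* capture group 1 holds the last sentence; LEFT drops its leading blank */
    if prxmatch(prx, text) then sentence = left(prxposn(prx, 1, text));
    keep sentence;
run;

proc print data=lastSentence2; run;
```

The `\s*$` part absorbs the trailing blanks that pad a fixed-length character variable, so the match can only end at the last sentence.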
Shmuel
Garnet | Level 18

The INDEXC function searches forward for the first occurrence of any character from a given list of characters.

As I couldn't find a similar function that searches backwards, I will use a loop that looks for the end of the previous sentence:

 

length paragraph $350;  /* contains your long string */

 

do i = length(trim(paragraph)) to 1 by -1;

      if substr(paragraph,i,1) in ('.' , '!' , '?')      /* you may add any other characters you want */

         then leave;             /* exit the loop with i pointing to the end of the previous sentence */

end;

last_sentence = substr(paragraph, i+1, length(trim(paragraph)) - i );

afs
Calcite | Level 5
Dear Patrick, thanks. I will study the link.
Shmuel
Garnet | Level 18

Thank you @Patrick, I skipped over the relevant modifier when reading the FINDC documentation.

This function shortens the code to:

     i = findc(paragraph , '.?!' , 'B');   /* the 'B' modifier searches backwards, from right to left */

     sentence = substr(paragraph, i+1,  length(trim(paragraph)) - i );

 

     put sentence=;    /* write the sentence to the log */
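
If the text itself ends with a punctuation mark, a backward search from the very end finds that closing mark first and nothing is left to extract. A minimal sketch of one way around this, assuming the optional start-position argument of FINDC is used to begin the backward search one character before the end (the sample text and the helper variable `n` are illustrative):

```sas
data _null_;
    length paragraph sentence $350;
    paragraph = 'First sentence. Second one! Is this the last sentence?';

    n = length(paragraph);    /* position of the last non-blank character */

    /* 'B' searches right to left; the 4th argument is the start position,
       here one character before the closing punctuation */
    if n > 1 then i = findc(paragraph, '.?!', 'B', n - 1);
    else i = 0;

    /* i = 0 means no earlier terminator was found, so the whole text is kept */
    sentence = left(substr(paragraph, i + 1, n - i));
    put sentence=;
run;
```

With this sample text the log should show the last sentence together with its question mark.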

Ksharp
Super User

Can't you use SCAN()?

 

 want = scan(paragraph ,-1, '.?!' );   
PGStats
Opal | Level 21

@Ksharp, you would have to use something like

 

sentence = coalescec(scan(text ,-1, '.?!' ), scan(text ,-2, '.?!' ));

 

but you would still miss the last punctuation character.

PG
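
The reason for the COALESCEC fallback is easy to miss, so here is a small sketch of it. When a delimiter list is given, blanks are not delimiters, and a character variable is padded with blanks, so the padding after the final period becomes the last "word". The variable names `w1` and `w2` and the sample text are illustrative.

```sas
data _null_;
    length text w1 w2 sentence $60;
    text = 'First sentence. Last sentence.';

    /* text is blank-padded to 60 characters and blanks are not in the
       delimiter list, so the padding after the final period is word -1 */
    w1 = scan(text, -1, '.?!');
    w2 = scan(text, -2, '.?!');

    /* fall back to word -2 when word -1 is blank */
    sentence = left(coalescec(w1, w2));
    put w1= w2= sentence=;
run;
```

Here `sentence` should come out as the last sentence without its period, which is the missing punctuation mentioned above.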
