jatxn
Calcite | Level 5

Hi, I would like to create separate columns (Line1 to LineN) to split individual sentences in a paragraph (column name is Comments) using SAS arrays. This is how I envisioned it to be:

Comment:

 Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...

Line1:

 Traffic data for South Entrance is estimated. 

Line2:

 Traffic counter was inoperable from December 1-7 due to new counter installation.

Line3:

 Estimated data generated by taking daily average from December 8-31...

 

Using the code below, I found that the maximum number of sentences in any Comment value is 17. I assume this will be the dimension of my array (?)

proc sql noprint;
select max(NumSentences) into :max_cnt trimmed
from work.SouthRim;
quit;
%put &=max_cnt;

 

I am thinking of using count, find and substr functions but do not know how and where to start. Can you please help me? Many thanks in advance!

 

 

PaigeMiller
Diamond | Level 26

How would you identify a sentence? Does the sentence end with a period, or can it end with a question mark or exclamation point or emoji? How many sentences are in this example text:

 

I went to visit Dr. Jones on Feb. 22, 2023

Why does the solution have to use an array (which seems to me to be particularly un-useful in this case)? If a solution was available without arrays, would that work?
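To make the first question concrete, here is a minimal sketch of how a period-only split would treat the example text. The variable names (text, n, piece) are only illustrative, and the step writes to the log rather than creating a data set.

```sas
/* Sketch only: split the example text on periods and log each piece. */
data _null_;
    text = 'I went to visit Dr. Jones on Feb. 22, 2023';
    /* COUNTW with '.' as the only delimiter counts period-separated pieces */
    n = countw(text, '.');
    put n=;
    do i = 1 to n;
        /* SCAN returns the i-th piece; STRIP removes surrounding blanks */
        piece = strip(scan(text, i, '.'));
        put i= piece=;
    end;
run;
```

With a period as the only delimiter, the periods after "Dr" and "Feb" are treated as sentence breaks, so a single sentence comes back as several pieces.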

--
Paige Miller
jatxn
Calcite | Level 5

Hi, for this case study, a sentence ends with a period only.

 

The solution doesn't have to be using arrays - it's just the first thing that came to my mind when I thought of a solution. Any approach is welcome! 🙂

PaigeMiller
Diamond | Level 26
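/* Sample data: one paragraph in TEXT */
data have;
text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

/* One output row per sentence, splitting on periods only */
data want;
    set have;
    length word $ 1024;
    /* Non-missing start value so the DO WHILE test passes the first time */
    word='AAA';
    i=0;
    do while(not missing(word));
        i=i+1;
        /* SCAN returns the i-th period-delimited piece (blank when none remain); CATS strips blanks */
        word=cats(scan(text,i,'.'));
        if not missing(word) then output;
    end;
    drop i text;
run;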

 

Despite the fact that your subject line asks for separate columns, the example you give produces separate rows, so that's what I did.
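If separate columns (Line1 to LineN) are what is needed after all, a sketch along the following lines might work. It assumes a sentence ends with a period only, that 17 (the maximum reported in the question) is a safe array dimension, and that 1024 characters per sentence is enough. The names want_wide, lines and piece are hypothetical, and the %let stands in for the max_cnt value produced by the PROC SQL step.

```sas
/* Sketch only: one output row per input row, sentences in Line1-Line17. */
%let max_cnt = 17;   /* assumption: stands in for the PROC SQL result */

data have;
    text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

data want_wide;
    set have;
    length Line1-Line&max_cnt piece $ 1024;
    array lines{&max_cnt} $ Line1-Line&max_cnt;
    do i = 1 to &max_cnt;
        /* SCAN returns a blank once no period-delimited pieces remain */
        piece = strip(scan(text, i, '.'));
        if missing(piece) then leave;
        /* SCAN drops the delimiter, so a single period is added back */
        lines{i} = cats(piece, '.');
    end;
    drop i piece;
run;
```

Unused columns stay blank for comments with fewer than 17 sentences. Because consecutive periods are treated as one delimiter, a trailing ellipsis comes back as a single period.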

 

Hint: if any approach will work, do not ask specifically for arrays or for any other particular SAS technique.

--
Paige Miller
