jatxn
Calcite | Level 5

Hi, I would like to split a paragraph (column name is Comments) into its individual sentences, with one sentence per column (Line1 to LineN), using SAS arrays. This is what I have in mind:

Comment:

 Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...

Line1:

 Traffic data for South Entrance is estimated. 

Line2:

 Traffic counter was inoperable from December 1-7 due to new counter installation.

Line3:

 Estimated data generated by taking daily average from December 8-31...

Using the code below, I found that the maximum number of sentences in any Comments value is 17. I assume this will be the dimension of my array (?)

proc sql noprint;
select max(NumSentences) into :max_cnt trimmed
from work.SouthRim;
quit;
%put &=max_cnt;
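
For readers who do not already have a NumSentences column, here is a minimal sketch of one way it could be derived, assuming every sentence ends with a period. The data set names work.sample and work.sample_counted and the sample rows are hypothetical, made up only so the step can run on its own.

```sas
/* Hypothetical sample data, standing in for work.SouthRim */
data work.sample;
    length Comments $ 400;
    Comments = 'First sentence. Second sentence.'; output;
    Comments = 'Only one sentence here.';          output;
run;

/* Count period-delimited pieces in each comment.
   STRIP removes the trailing padding first, so blanks after the
   final period cannot be mistaken for an extra piece. */
data work.sample_counted;
    set work.sample;
    NumSentences = countw(strip(Comments), '.');
run;

/* Same query as in the post, pointed at the sample data */
proc sql noprint;
    select max(NumSentences) into :max_cnt trimmed
    from work.sample_counted;
quit;
%put &=max_cnt;
```

Because COUNTW treats a run of consecutive delimiters as one by default, an ellipsis such as "..." should not inflate the count.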

I am thinking of using the COUNT, FIND and SUBSTR functions but do not know how or where to start. Can you please help me? Many thanks in advance!

PaigeMiller
Diamond | Level 26

How would you identify a sentence? Does the sentence end with a period, or can it end with a question mark or exclamation point or emoji? How many sentences are in this example text:

I went to visit Dr. Jones on Feb. 22, 2023
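
To make the ambiguity concrete, here is a small sketch (not part of the original reply) that applies a period-only rule to the text above and writes each piece to the log. The variable names n, i and piece are illustrative only. Under this rule the abbreviations "Dr." and "Feb." each end a piece, so the single sentence comes back as several fragments.

```sas
data _null_;
    text = 'I went to visit Dr. Jones on Feb. 22, 2023';
    /* Number of period-delimited pieces in the text */
    n = countw(text, '.');
    put n=;
    do i = 1 to n;
        /* i-th piece, with leading and trailing blanks removed */
        piece = strip(scan(text, i, '.'));
        put i= piece=;
    end;
run;
```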

Why does the solution have to use an array (which seems to me to be particularly un-useful in this case)? If a solution was available without arrays, would that work?

--
Paige Miller
jatxn
Calcite | Level 5

Hi, for this case study, a sentence ends with a period only.

The solution doesn't have to be using arrays - it's just the first thing that came to my mind when I thought of a solution. Any approach is welcome! 🙂

PaigeMiller
Diamond | Level 26
data have;
text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

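data want;
    set have;
    length word $ 1024;
    /* Non-missing seed value so the loop body runs at least once */
    word='AAA';
    i=0;
    do while(not missing(word));
        i=i+1;
        /* i-th period-delimited piece; CATS strips leading and trailing
           blanks. The periods are delimiters and are not kept. */
        word=cats(scan(text,i,'.'));
        /* One output row per sentence; stop at the first empty piece */
        if not missing(word) then output;
    end;
    drop i text;
run;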

Despite the fact that your subject line asks for separate columns, the example you give produces separate rows, so that's what I did.

Hint: if any approach will work, don't ask for arrays or any other specific SAS technique.
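
If separate columns are wanted after all, a minimal array-based sketch might look like the following. It is not from the original reply. It assumes a period is the only sentence delimiter and uses the maximum of 17 reported in the question. The data set name want_wide is hypothetical, and the sample row is copied from the reply so the step can run on its own.

```sas
/* Sample data copied from the reply above */
data have;
text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

/* Assumption: 17 is the maximum from the question;
   any value at or above the true maximum works */
%let max_cnt = 17;

data want_wide;
    set have;
    /* Creates character variables Line1-Line&max_cnt, each of length 1024 */
    array Line[&max_cnt] $ 1024;
    do i = 1 to dim(Line);
        /* SCAN returns a blank once i exceeds the number of pieces,
           so unused columns are simply left missing */
        Line[i] = strip(scan(text, i, '.'));
    end;
    drop i;
run;
```

As in the row-based version, the periods act as delimiters and are dropped from the stored sentences.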

--
Paige Miller
