jatxn
Calcite | Level 5

Hi, I would like to create separate columns (Line1 to LineN) to split individual sentences in a paragraph (column name is Comments) using SAS arrays. This is how I envisioned it to be:

Comment:

 Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...

Line1:

 Traffic data for South Entrance is estimated. 

Line2:

 Traffic counter was inoperable from December 1-7 due to new counter installation.

Line3:

 Estimated data generated by taking daily average from December 8-31...

 

Using the code below, I found that the maximum number of sentences in any Comment value is 17. I assume this would be the dimension of my array (?)

proc sql noprint;
select max(NumSentences) into :max_cnt trimmed
from work.SouthRim;
quit;
%put &=max_cnt;

 

I am thinking of using count, find and substr functions but do not know how and where to start. Can you please help me? Many thanks in advance!

 

 

3 REPLIES
PaigeMiller
Diamond | Level 26

How would you identify a sentence? Does the sentence end with a period, or can it end with a question mark or exclamation point or emoji? How many sentences are in this example text:

 

I went to visit Dr. Jones on Feb. 22, 2023
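
To illustrate why the definition matters, here is a minimal sketch that splits that text on periods alone. The variable names (text, n, piece) are only for illustration. It should report three pieces, even though a reader would see a single sentence:

```sas
data _null_;
    text = 'I went to visit Dr. Jones on Feb. 22, 2023';
    /* Count the pieces when the period is the only delimiter */
    n = countw(text, '.');
    put n=;
    do i = 1 to n;
        /* SCAN returns the i-th piece; STRIP removes surrounding blanks */
        piece = strip(scan(text, i, '.'));
        put i= piece=;
    end;
run;
```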

Why does the solution have to use an array (which seems to me to be particularly un-useful in this case)? If a solution was available without arrays, would that work?

--
Paige Miller
jatxn
Calcite | Level 5

Hi, for this case study, a sentence ends with a period only.

 

The solution doesn't have to be using arrays - it's just the first thing that came to my mind when I thought of a solution. Any approach is welcome! 🙂

PaigeMiller
Diamond | Level 26
data have;
text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

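/* Write one row per sentence, using the period as the only delimiter */
data want;
    set have;
    length word $ 1024;
    word='AAA';    /* non-missing placeholder so the loop body runs at least once */
    i=0;
    do while(not missing(word));
        i=i+1;
        /* SCAN returns the i-th period-delimited piece (blank when none remain); CATS strips blanks */
        word=cats(scan(text,i,'.'));
        if not missing(word) then output;
    end;
    drop i text;
run;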

 

Despite the fact that your subject line asks for separate columns, the example you give produces separate rows, so that's what I did.
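
If separate columns (Line1 to LineN) are what you want after all, an array can do it. The following is only a sketch under some assumptions: a sentence ends with a period only, the second sample row and the length of 400 are made up for illustration, and the array name line is my choice. The macro variable max_cnt is recomputed here with COUNTW rather than taken from your NumSentences column.

```sas
/* Sample data: the text from the example above, plus a made-up shorter row */
data have;
    length text $ 400;
    text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
    output;
    text='Counter was operating normally.';
    output;
run;

/* Largest number of period-delimited pieces in any row.
   STRIP removes trailing blanks so they are not counted as an extra piece. */
proc sql noprint;
    select max(countw(strip(text), '.')) into :max_cnt trimmed
    from have;
quit;

data want;
    set have;
    /* One character column per sentence: Line1 to Line&max_cnt */
    array line[&max_cnt] $ 400;
    do i = 1 to countw(strip(text), '.');
        /* SCAN drops the delimiter, so a single period is appended back */
        line[i] = cats(scan(text, i, '.'), '.');
    end;
    drop i;
run;
```

Rows with fewer sentences than max_cnt leave the remaining Line columns blank. Note that a trailing "..." comes back as a single period, because SCAN treats consecutive delimiters as one.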

 

Hint: if any approach will work, do not ask specifically for arrays or for any other particular SAS technique.

--
Paige Miller
