Hi, I would like to create separate columns (Line1 to LineN) to split individual sentences in a paragraph (column name is Comments) using SAS arrays. This is how I envisioned it to be:
Comments:
Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...
Line1:
Traffic data for South Entrance is estimated.
Line2:
Traffic counter was inoperable from December 1-7 due to new counter installation.
Line3:
Estimated data generated by taking daily average from December 8-31...
Using the code below, I found that the maximum number of sentences in any Comments value is 17. I assume this will be the dimension of my array (?)
proc sql noprint;
    /* Store the largest sentence count in the macro variable max_cnt */
    select max(NumSentences) into :max_cnt trimmed
    from work.SouthRim;
quit;
%put &=max_cnt;
I am thinking of using count, find and substr functions but do not know how and where to start. Can you please help me? Many thanks in advance!
How would you identify a sentence? Does the sentence end with a period, or can it end with a question mark or exclamation point or emoji? How many sentences are in this example text:
I went to visit Dr. Jones on Feb. 22, 2023
Why does the solution have to use an array (which seems to me to be particularly un-useful in this case)? If a solution was available without arrays, would that work?
Hi, for this case study, a sentence ends with a period only.
The solution doesn't have to be using arrays - it's just the first thing that came to my mind when I thought of a solution. Any approach is welcome! 🙂
data have;
text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;
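data want;
    set have;
    length word $ 1024;
    word='AAA';   /* non-missing seed so the loop runs at least once */
    i=0;
    /* Pull out period-delimited pieces until SCAN returns a blank */
    do while(not missing(word));
        i=i+1;
        word=cats(scan(text,i,'.'));   /* CATS strips leading and trailing blanks */
        if not missing(word) then output;   /* one row per sentence */
    end;
    drop i text;
run;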
Despite the fact that your subject line asks for separate columns, the example you give produces separate rows, so that's what I did.
Hint: if any approach will work, don't ask for arrays, or any other specific SAS technique, in your question.
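If separate columns really are needed, as the subject line asks, the same SCAN logic can fill an array instead of writing rows. The following is only a sketch, not a tested solution: it assumes the input data set is HAVE with the variable TEXT as above, and that max_cnt holds the maximum of 17 found by the PROC SQL step in the question. The data set name want_wide and the array name lines are made up for illustration.

```sas
/* Assumption: 17 is the maximum sentence count reported in the question. */
%let max_cnt = 17;

data want_wide;                      /* hypothetical output data set name */
    set have;
    /* Creates the character variables Line1-Line17, each of length 1024 */
    array lines[&max_cnt] $ 1024 Line1-Line&max_cnt;
    /* COUNTW counts the period-delimited pieces; MIN guards the array bound */
    do i = 1 to min(countw(text, '.'), dim(lines));
        lines[i] = strip(scan(text, i, '.'));
    end;
    drop i;
run;
```

For the example text this should fill Line1 to Line3 and leave Line4 to Line17 blank. As in the row version, SCAN drops the periods themselves, and a period-only rule would still split a value such as "Dr. Jones" into two pieces.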