Hi, I would like to create separate columns (Line1 to LineN) to split individual sentences in a paragraph (column name is Comments) using SAS arrays. This is how I envisioned it to be:
Comment:
Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...
Line1:
Traffic data for South Entrance is estimated.
Line2:
Traffic counter was inoperable from December 1-7 due to new counter installation.
Line3:
Estimated data generated by taking daily average from December 8-31...
Using the code below, I found that the maximum number of sentences in any Comments value is 17. I assume this will be the dimension of my array (?)
proc sql noprint;
select max(NumSentences) into :max_cnt trimmed
from work.SouthRim;
quit;
%put &=max_cnt;
I am thinking of using count, find and substr functions but do not know how and where to start. Can you please help me? Many thanks in advance!
How would you identify a sentence? Does the sentence end with a period, or can it end with a question mark or exclamation point or emoji? How many sentences are in this example text:
I went to visit Dr. Jones on Feb. 22, 2023
Why does the solution have to use an array (which seems to me to be particularly unhelpful in this case)? If a solution were available without arrays, would that work?
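To make the ambiguity concrete, here is a small sketch of what happens if a period is treated as the only sentence delimiter. It uses COUNTW and SCAN on the example text above; the variable names (text, n, piece) are just for illustration.

```sas
/* Sketch: split the example text on periods only and list the pieces */
data _null_;
    text = 'I went to visit Dr. Jones on Feb. 22, 2023';
    n = countw(text, '.');      /* number of period-delimited pieces */
    put n=;
    do i = 1 to n;
        piece = strip(scan(text, i, '.'));
        put i= piece=;
    end;
run;
```

With a period-only rule, this single sentence is split into three pieces ("I went to visit Dr", "Jones on Feb" and "22, 2023"), which is why the definition of a sentence matters.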
Hi, for this case study, a sentence ends with a period only.
The solution doesn't have to be using arrays - it's just the first thing that came to my mind when I thought of a solution. Any approach is welcome! 🙂
data have;
    text='Traffic data for South Entrance is estimated. Traffic counter was inoperable from December 1-7 due to new counter installation. Estimated data generated by taking daily average from December 8-31...';
run;

data want;
    set have;
    length word $ 1024;
    word='AAA';   /* any non-missing value, so the loop runs at least once */
    i=0;
    do while(not missing(word));
        i=i+1;
        /* SCAN returns the i-th period-delimited piece; CATS strips blanks */
        word=cats(scan(text,i,'.'));
        if not missing(word) then output;   /* one row per sentence */
    end;
    drop i text;
run;
Despite the fact that your subject line asks for separate columns, the example you give produces separate rows, so that's what I did.
Hint: if any approach will work, don't ask for arrays or any other specific SAS technique in your question.
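If separate columns really are what's needed, something along these lines might work. This is only a sketch under a few assumptions: the input is the HAVE data set above with the variable text, a sentence ends with a period only, and no value has more than 17 sentences (the maximum reported earlier in the thread). The data set name want_wide and the array name sent are made up for illustration.

```sas
/* Sketch only: one column per sentence (Line1-Line17).              */
/* Assumes data set HAVE with variable TEXT, period-only sentences,  */
/* and at most &max_cnt sentences per value.                         */
%let max_cnt = 17;

data want_wide;
    set have;
    length Line1-Line&max_cnt $ 1024;
    array sent[*] Line1-Line&max_cnt;
    /* COUNTW counts the period-delimited pieces; cap at the array size */
    n = min(countw(text, '.'), dim(sent));
    do i = 1 to n;
        sent[i] = strip(scan(text, i, '.'));
        /* SCAN drops the delimiter, so put a single period back */
        if not missing(sent[i]) then sent[i] = cats(sent[i], '.');
    end;
    drop i n;
run;
```

Note that SCAN treats consecutive periods as one delimiter, so a trailing "..." comes back as a single period in this sketch. Unused Line variables are simply left blank.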