<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Splitting a paragraph into an array of sentences in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226245#M54016</link>
    <description>Thank you - that's very helpful. Unfortunately, the data is too big to take out of SAS. But how could you create an array? So that instead of having 7 rows in the output file, you still have the 3 rows, but multiple variables in an array string1-stringn? Thanks again</description>
    <pubDate>Fri, 18 Sep 2015 11:58:06 GMT</pubDate>
    <dc:creator>Nadz</dc:creator>
    <dc:date>2015-09-18T11:58:06Z</dc:date>
    <item>
      <title>Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226219#M54012</link>
      <description>Hi, I'd like to split a paragraph into an array of sentences - where the delimiter is any of . ? ! and consecutive delimiters are treated as one, e.g. string=Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week.Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia? Huge numbers of people heading north from the Mediterranean have created a political crisis in the European Union? Croatian officials said roads leading to the border crossings had also been shut! The crossing on the main road linking Belgrade and Zagreb - at Bajakovo - appeared to be the only one left open string1=Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week. string2=Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia? ... etc Can anyone help?</description>
      <pubDate>Fri, 18 Sep 2015 09:46:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226219#M54012</guid>
      <dc:creator>Nadz</dc:creator>
      <dc:date>2015-09-18T09:46:25Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226226#M54015</link>
      <description>&lt;P&gt;Well, it's a bit tricky in code; an example is below. &amp;nbsp;You also run the risk of a few problems: extra spaces, and dots within words (for example etc., or e.g.). &amp;nbsp;I would prefer to process the text at source rather than at receipt.&lt;/P&gt;&lt;P&gt;data have;&lt;BR /&gt;&amp;nbsp; string="Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week. Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia? Huge numbers of people heading north from the Mediterranean have created a political crisis in the European Union? Croatian officials said roads leading to the border crossings had also been shut! The crossing on the main road linking Belgrade and Zagreb - at Bajakovo - appeared to be the only one left open";&lt;BR /&gt;&amp;nbsp; output;&lt;BR /&gt;&amp;nbsp; string="Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week.";&lt;BR /&gt;&amp;nbsp; output;&lt;BR /&gt;&amp;nbsp; string="Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia?";&lt;BR /&gt;&amp;nbsp; output;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;&amp;nbsp; set have;&lt;BR /&gt;&amp;nbsp; length sentance $2000;&lt;BR /&gt;&amp;nbsp; s_ord=1;&lt;BR /&gt;&amp;nbsp; do i=1 to length(string);&lt;BR /&gt;&amp;nbsp; &amp;nbsp; if substr(string,i,1) in ('.','?','!') or i=length(string) then do;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; substr(sentance,s_ord,1)=substr(string,i,1);&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; output;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; sentance="";&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; s_ord=1;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; end;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; else do;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; substr(sentance,s_ord,1)=substr(string,i,1);&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; s_ord=s_ord+1;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; end;&lt;BR /&gt;&amp;nbsp; end;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Fri, 18 Sep 2015 10:20:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226226#M54015</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2015-09-18T10:20:51Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226245#M54016</link>
      <description>Thank you - that's very helpful. Unfortunately, the data is too big to take out of SAS. But how could you create an array? So that instead of having 7 rows in the output file, you still have the 3 rows, but multiple variables in an array string1-stringn? Thanks again</description>
      <pubDate>Fri, 18 Sep 2015 11:58:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226245#M54016</guid>
      <dc:creator>Nadz</dc:creator>
      <dc:date>2015-09-18T11:58:06Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226253#M54017</link>
      <description>&lt;P&gt;I would think you could use the SCAN function, where delimiters are period, exclamation and question mark, to extract each sentence one by one. Of course, as noted, this fails if there is a period within a sentence, like "Mr. Jones came home."&lt;/P&gt;</description>
      <pubDate>Fri, 18 Sep 2015 12:31:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226253#M54017</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2015-09-18T12:31:39Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226259#M54018</link>
      <description>&lt;P&gt;Indeed, you raise a good point about scan(); it can reduce the code somewhat:&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;&amp;nbsp; set have;&lt;BR /&gt;&amp;nbsp; i=1;&lt;BR /&gt;&amp;nbsp; do while(scan(string,i,".?!") ne "");&lt;BR /&gt;&amp;nbsp; &amp;nbsp; sentance=scan(string,i,".?!");&lt;BR /&gt;&amp;nbsp; &amp;nbsp; output;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; i=i+1;&lt;BR /&gt;&amp;nbsp; end;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Fri, 18 Sep 2015 13:00:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226259#M54018</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2015-09-18T13:00:53Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226270#M54019</link>
      <description>&lt;P&gt;You'd need to add a LENGTH statement before you begin the loop&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;LENGTH SENTENCE $ 1024;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;or some other length.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Sep 2015 13:40:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226270#M54019</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2015-09-18T13:40:35Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226520#M54046</link>
      <description>Thank you both. A SAS consultant also offered me this which worked - pretty awesome! data have; string="Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week. Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia? Huge numbers of people heading north from the Mediterranean have created a political crisis in the European Union? Croatian officials said roads leading to the border crossings had also been shut! The crossing on the main road linking Belgrade and Zagreb - at Bajakovo - appeared to be the only one left open"; output; string="Officials said they had no choice after more than 13,000 people entered the country since Hungary fenced off its border with Serbia earlier this week."; output; string="Many have been taken by bus to reception centres but some say they plan to walk to neighbouring Slovenia?"; output; run; data sentences; set have; array arr_sentence {6} $ 200 sentence_1-sentence_6; do n = 1 to 6; arr_sentence{n}=scan(string,n,'.?!'); end; drop n; run;</description>
      <pubDate>Mon, 21 Sep 2015 08:51:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226520#M54046</guid>
      <dc:creator>Nadz</dc:creator>
      <dc:date>2015-09-21T08:51:39Z</dc:date>
    </item>
    <item>
      <title>Re: Splitting a paragraph into an array of sentences</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226522#M54047</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, that is essentially the same as what I posted. &amp;nbsp;Firstly, the code is unreadable; you need to put things on different lines and use indentation. &amp;nbsp;There are two main differences. Firstly, he has assumed that there will be no more than 6 sentences per string. &amp;nbsp;This may or may not work. &amp;nbsp;Secondly, he uses an array of blocks of text up to a maximum of 6. This may be fine, but I tend to prefer a normalised structure (data goes down rather than across) as the code is simplified.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Sep 2015 08:56:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Splitting-a-paragraph-into-an-array-of-sentences/m-p/226522#M54047</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2015-09-21T08:56:26Z</dc:date>
    </item>
  </channel>
</rss>

