<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Concepts node in Text Analytics in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/911963#M10709</link>
    <description>&lt;P&gt;In general, try to avoid creating Concepts for 'free words' or other 'negatives', as this greatly reduces performance in LITI.&lt;BR /&gt;&lt;BR /&gt;I would suggest using Concept rules to extract your start/stop words (one Concept for 'Start_One', one for 'Start_Two', and one for 'Stop_Word'), and then using two predicate rules to extract 'Side_One' and 'Side_Two' respectively.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A predicate rule is a 'Fact Rule Type' and is specifically designed to extract combinations of concepts together with their context. For example, you could use the following syntax to extract the first side:&lt;/P&gt;
&lt;P&gt;PREDICATE_RULE:(start_label,end_label):(SENT, "_start_label{Start_One}","_end_label{Start_Two}").&lt;/P&gt;
&lt;P&gt;SENT indicates that both {Start_One} and {Start_Two} occur in the same sentence.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Alternatively, you can use SEQUENCE instead of PREDICATE_RULE. See also&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/ctxtcdc/v_017/ctxtug/p1kf71w7npr9ecn1gysvovfs42x2.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/ctxtcdc/v_017/ctxtug/p1kf71w7npr9ecn1gysvovfs42x2.htm&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Jan 2024 12:40:34 GMT</pubDate>
    <dc:creator>PaulKoot</dc:creator>
    <dc:date>2024-01-18T12:40:34Z</dc:date>
    <item>
      <title>Concepts node in Text Analytics</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/909691#M10682</link>
      <description>&lt;P&gt;Hey all!&lt;/P&gt;&lt;P&gt;I am using Visual Text Analytics for a project.&lt;/P&gt;&lt;P&gt;My text data set contains many government agreements.&lt;/P&gt;&lt;P&gt;In one of the concepts, I am trying to extract the names of the two sides of an agreement.&lt;/P&gt;&lt;P&gt;To do that, I need to extract all the text from one specific word up to another, without knowing its length in advance.&lt;/P&gt;&lt;P&gt;For example, for the text "side one is me, my friend and my dad, side two is all Mexican people stop_word" and the words "side one" and "side two", I would expect to extract "is me, my friend and my dad" and "is all Mexican people".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My current approach is to define another concept, 'free words', in which I defined many rows with the structure: CONCEPT: _w _w... etc.&lt;/P&gt;&lt;P&gt;It seems to have heavily affected my performance.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone have any ideas?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 26 Dec 2023 12:32:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/909691#M10682</guid>
      <dc:creator>TzufRaifMia</dc:creator>
      <dc:date>2023-12-26T12:32:39Z</dc:date>
    </item>
    <item>
      <title>Re: Concepts node in Text Analytics</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/911963#M10709</link>
      <description>&lt;P&gt;In general, try to avoid creating Concepts for 'free words' or other 'negatives', as this greatly reduces performance in LITI.&lt;BR /&gt;&lt;BR /&gt;I would suggest using Concept rules to extract your start/stop words (one Concept for 'Start_One', one for 'Start_Two', and one for 'Stop_Word'), and then using two predicate rules to extract 'Side_One' and 'Side_Two' respectively.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A predicate rule is a 'Fact Rule Type' and is specifically designed to extract combinations of concepts together with their context. For example, you could use the following syntax to extract the first side:&lt;/P&gt;
&lt;P&gt;PREDICATE_RULE:(start_label,end_label):(SENT, "_start_label{Start_One}","_end_label{Start_Two}").&lt;/P&gt;
&lt;P&gt;SENT indicates that both {Start_One} and {Start_Two} occur in the same sentence.&amp;nbsp;&lt;/P&gt;
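&lt;P&gt;By analogy, a sketch of the rule for the second side could look like the following. It is untested, and it assumes that the 'Start_Two' and 'Stop_Word' Concepts suggested above exist under exactly those names; the label names are arbitrary:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PREDICATE_RULE:(start_label,end_label):(SENT, "_start_label{Start_Two}","_end_label{Stop_Word}")&lt;/CODE&gt;&lt;/PRE&gt;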
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Alternatively, you can use SEQUENCE instead of PREDICATE_RULE. See also&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/ctxtcdc/v_017/ctxtug/p1kf71w7npr9ecn1gysvovfs42x2.htm" target="_blank"&gt;https://documentation.sas.com/doc/en/ctxtcdc/v_017/ctxtug/p1kf71w7npr9ecn1gysvovfs42x2.htm&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jan 2024 12:40:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/911963#M10709</guid>
      <dc:creator>PaulKoot</dc:creator>
      <dc:date>2024-01-18T12:40:34Z</dc:date>
    </item>
    <item>
      <title>Re: Concepts node in Text Analytics</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/916804#M10723</link>
      <description>&lt;P&gt;Here's an example I have just made:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PREDICATE_RULE: (aa,bb): (SENT, "_aa{trial@}", "_bb{enroll@}")
PREDICATE_RULE: (xx,yy): (DIST_10, "_xx{trial@}", "_yy{enroll@}")&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The concept match covers all words from trial up to enroll, both included.&lt;/P&gt;
&lt;P&gt;In the first rule, trial and enroll must belong to the same sentence.&lt;/P&gt;
&lt;P&gt;In the second rule,&amp;nbsp;trial and enroll must be within 10 words of each other (possibly across multiple sentences).&lt;/P&gt;
&lt;P&gt;enroll&lt;STRONG&gt;@&lt;/STRONG&gt; means something like enroll&lt;STRONG&gt;ed&lt;/STRONG&gt; will also be accepted.&amp;nbsp;@ is a morphological expansion symbol here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In both rules, the order of the two arguments does not matter. If you absolutely want trial@ to come first and enroll@ second, you can use ORDDIST_10 instead of DIST_10. ORDDIST_n respects the order.&lt;/P&gt;
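&lt;P&gt;As a sketch (not run here), the ordered variant of the second rule above would simply swap the operator and keep everything else the same:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PREDICATE_RULE: (xx,yy): (ORDDIST_10, "_xx{trial@}", "_yy{enroll@}")&lt;/CODE&gt;&lt;/PRE&gt;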
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Mon, 19 Feb 2024 15:31:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Concepts-node-in-Text-Analytics/m-p/916804#M10723</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2024-02-19T15:31:29Z</dc:date>
    </item>
  </channel>
</rss>

