<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS Text Miner in Enterprise Miner: Derived text variables in SAS Academy for Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/SAS-Text-Miner-in-Enterprise-Miner-Derived-text-variables/m-p/621724#M528</link>
    <description>&lt;P&gt;SAS Text Miner only sees Text variables and Target variables (variables with roles Text or Target). Target variables are only seen if they have a level of binary or nominal. If there are two or more Text variables in a SAS data set, the Text Parsing node selects exactly one of the Text variables for analysis and ignores all of the rest. It has no way of knowing how any of the Text variables were created, whether concatenated or filtered or anything else. If there are two or more Text variables, the Text Parsing node uses the following selection rules:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Pick the Text variable with the greatest length.&lt;/P&gt;
&lt;P&gt;2. If two Text variables tie for the greatest length, pick the one that comes first in sort order by name. Example: if variables Animals and Vegetables both have length 272, the node chooses Animals because A sorts before V.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a best practice, never let the Text Parsing node choose for you. Set the Use status to No for every Text variable except the one that YOU choose to include in the analysis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to concatenate two or more Text variables, use a SAS Code node. Example code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data &amp;amp;EM_EXPORT_TRAIN;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;set &amp;amp;EM_IMPORT_DATA;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;attrib NewText length=$242; /* Assume Text1-Text3 each have length 80 */&lt;BR /&gt;&amp;nbsp; &amp;nbsp;NewText=catx(' ',Text1,Text2,Text3);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
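&lt;P&gt;The ATTRIB statement is necessary to prevent truncation of the resulting concatenation: three 80-character values plus two separating blanks can need up to 242 characters. Without the ATTRIB statement, NewText would default to a length of 200, and any longer result would be truncated to 200 characters.&lt;/P&gt;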
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can attach a Text Parsing node to the SAS Code node and do the analysis using the concatenated variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope this helps.&lt;/P&gt;</description>
    <pubDate>Sun, 02 Feb 2020 02:58:08 GMT</pubDate>
    <dc:creator>TWoodfield</dc:creator>
    <dc:date>2020-02-02T02:58:08Z</dc:date>
    <item>
      <title>SAS Text Miner in Enterprise Miner: Derived text variables</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/SAS-Text-Miner-in-Enterprise-Miner-Derived-text-variables/m-p/621554#M527</link>
      <description>If you concatenate multiple text variables into a new derived variable, how does the subsequent text mining process handle it? Does it use the original text variables plus the derived text variable?&lt;BR /&gt;&lt;BR /&gt;If one were to do this, which text mining node accomplishes the task?&lt;BR /&gt;</description>
      <pubDate>Fri, 31 Jan 2020 21:30:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/SAS-Text-Miner-in-Enterprise-Miner-Derived-text-variables/m-p/621554#M527</guid>
      <dc:creator>eddieray01</dc:creator>
      <dc:date>2020-01-31T21:30:39Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Text Miner in Enterprise Miner: Derived text variables</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/SAS-Text-Miner-in-Enterprise-Miner-Derived-text-variables/m-p/621724#M528</link>
      <description>&lt;P&gt;SAS Text Miner only sees Text variables and Target variables (variables with roles Text or Target). Target variables are only seen if they have a level of binary or nominal. If there are two or more Text variables in a SAS data set, the Text Parsing node selects exactly one of the Text variables for analysis and ignores all of the rest. It has no way of knowing how any of the Text variables were created, whether concatenated or filtered or anything else. If there are two or more Text variables, the Text Parsing node uses the following selection rules:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Pick the Text variable with the greatest length.&lt;/P&gt;
&lt;P&gt;2. If two Text variables tie for the greatest length, pick the one that comes first in sort order by name. Example: if variables Animals and Vegetables both have length 272, the node chooses Animals because A sorts before V.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a best practice, never let the Text Parsing node choose for you. Set the Use status to No for every Text variable except the one that YOU choose to include in the analysis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to concatenate two or more Text variables, use a SAS Code node. Example code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data &amp;amp;EM_EXPORT_TRAIN;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;set &amp;amp;EM_IMPORT_DATA;&lt;BR /&gt;&amp;nbsp; &amp;nbsp;attrib NewText length=$242; /* Assume Text1-Text3 each have length 80 */&lt;BR /&gt;&amp;nbsp; &amp;nbsp;NewText=catx(' ',Text1,Text2,Text3);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
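&lt;P&gt;The ATTRIB statement is necessary to prevent truncation of the resulting concatenation: three 80-character values plus two separating blanks can need up to 242 characters. Without the ATTRIB statement, NewText would default to a length of 200, and any longer result would be truncated to 200 characters.&lt;/P&gt;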
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can attach a Text Parsing node to the SAS Code node and do the analysis using the concatenated variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope this helps.&lt;/P&gt;</description>
      <pubDate>Sun, 02 Feb 2020 02:58:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/SAS-Text-Miner-in-Enterprise-Miner-Derived-text-variables/m-p/621724#M528</guid>
      <dc:creator>TWoodfield</dc:creator>
      <dc:date>2020-02-02T02:58:08Z</dc:date>
    </item>
  </channel>
</rss>

