<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to Standardize Text Values with Text Miner in SAS Enterprise Miner? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-Standardize-Text-Values-with-Text-Miner-in-SAS-Enterprise/m-p/572744#M10028</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This sounds more like a data cleansing or fuzzy matching task. Take a look at a paper like this one for SAS functions and programs that can help you standardize the input:&amp;nbsp;&lt;A href="https://www.lexjansen.com/sesug/2018/SESUG2018_Paper-143_Final_PDF.pdf" target="_blank"&gt;https://www.lexjansen.com/sesug/2018/SESUG2018_Paper-143_Final_PDF.pdf&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Text Miner is based on how terms tend to co-occur within documents. The learning happens across the collection, based on these co-occurrence patterns. In your example, where you mostly have a single term per document, there is no co-occurrence to learn from, so Text Miner is not the best tool for this kind of task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 11 Jul 2019 14:09:55 GMT</pubDate>
    <dc:creator>RussAlbright</dc:creator>
    <dc:date>2019-07-11T14:09:55Z</dc:date>
    <item>
      <title>How to Standardize Text Values with Text Miner in SAS Enterprise Miner?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-Standardize-Text-Values-with-Text-Miner-in-SAS-Enterprise/m-p/572573#M10027</link>
      <description>&lt;P&gt;Hello everybody,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to do some analysis with Text Miner in SAS Enterprise Miner. I have a text variable with a length of 500 characters and 30,000 distinct values. I want to standardize this variable by excluding +/- signs and conjunctions like “or”, “with”, etc. I also want to transform these values from messy variants into one clean value. For example, suppose I have the values below:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Papers *?&lt;/P&gt;&lt;P&gt;PAPER&lt;/P&gt;&lt;P&gt;paper&lt;/P&gt;&lt;P&gt;pAPER&lt;/P&gt;&lt;P&gt;Paper&lt;/P&gt;&lt;P&gt;pape&lt;/P&gt;&lt;P&gt;PAPE&lt;/P&gt;&lt;P&gt;Pape&lt;/P&gt;&lt;P&gt;Paper with&lt;/P&gt;&lt;P&gt;Paper or book&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want all of the above values to appear simply as Paper. How can I do this with Text Miner? Can somebody help me resolve this, please?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2019 00:46:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-Standardize-Text-Values-with-Text-Miner-in-SAS-Enterprise/m-p/572573#M10027</guid>
      <dc:creator>ertr</dc:creator>
      <dc:date>2019-07-11T00:46:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to Standardize Text Values with Text Miner in SAS Enterprise Miner?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/How-to-Standardize-Text-Values-with-Text-Miner-in-SAS-Enterprise/m-p/572744#M10028</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This sounds more like a data cleansing or fuzzy matching task. Take a look at a paper like this one for SAS functions and programs that can help you standardize the input:&amp;nbsp;&lt;A href="https://www.lexjansen.com/sesug/2018/SESUG2018_Paper-143_Final_PDF.pdf" target="_blank"&gt;https://www.lexjansen.com/sesug/2018/SESUG2018_Paper-143_Final_PDF.pdf&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Text Miner is based on how terms tend to co-occur within documents. The learning happens across the collection, based on these co-occurrence patterns. In your example, where you mostly have a single term per document, there is no co-occurrence to learn from, so Text Miner is not the best tool for this kind of task.&lt;/P&gt;
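&lt;P&gt;As a rough illustration of the cleansing and fuzzy matching approach described above, here is a minimal sketch in Python rather than SAS. The canonical vocabulary, the stop-word list, the 0.8 similarity cutoff, and the function name &lt;CODE&gt;standardize&lt;/CODE&gt; are all assumptions made for this example, not anything taken from the linked paper or from Text Miner.&lt;/P&gt;

```python
import difflib
import re

# Assumed for illustration: the target vocabulary and the words to drop.
CANONICAL = ["paper", "book"]
STOP_WORDS = {"or", "with", "and"}


def standardize(value, canonical=CANONICAL, cutoff=0.8):
    """Map a messy value to the first canonical term it resembles."""
    # Lowercase, then replace signs and punctuation (+, -, *, ?) with spaces.
    cleaned = re.sub(r"[^a-z\s]", " ", value.lower())
    # Split into tokens and drop the stop words.
    tokens = [t for t in cleaned.split() if t not in STOP_WORDS]
    for token in tokens:
        # Fuzzy match the token against the canonical vocabulary.
        match = difflib.get_close_matches(token, canonical, n=1, cutoff=cutoff)
        if match:
            return match[0].title()
    # Nothing matched: return the input unchanged so it can be reviewed by hand.
    return value


if __name__ == "__main__":
    for raw in ["Papers *?", "pAPER", "PAPE", "Paper with", "Paper or book"]:
        print(repr(raw), "maps to", standardize(raw))
```

&lt;P&gt;Under these assumptions, each of the ten example values from the question maps to Paper. With 30,000 real values, the cutoff and the vocabulary would need tuning, and unmatched values come back unchanged so they can be checked manually.&lt;/P&gt;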
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2019 14:09:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/How-to-Standardize-Text-Values-with-Text-Miner-in-SAS-Enterprise/m-p/572744#M10028</guid>
      <dc:creator>RussAlbright</dc:creator>
      <dc:date>2019-07-11T14:09:55Z</dc:date>
    </item>
  </channel>
</rss>

