<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: VTA bulk PDF/Docx analysis in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/581012#M10033</link>
    <description>Can anybody help me with this issue, please?&lt;BR /&gt;Thank you very much.</description>
    <pubDate>Wed, 14 Aug 2019 08:05:35 GMT</pubDate>
    <dc:creator>carlosGoetz</dc:creator>
    <dc:date>2019-08-14T08:05:35Z</dc:date>
    <item>
      <title>VTA bulk PDF/Docx analysis</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/578661#M10030</link>
      <description>&lt;P&gt;Good morning everybody.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to analyze a huge number of legal documents in order to find out which ones have certain clauses and which ones don't. I'd like to know how to proceed. I'm using Visual Text Analytics on SAS Viya 3.4, but it seems to me that it's impossible to do something like that.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you help me with this issue, please?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
      <pubDate>Fri, 02 Aug 2019 11:29:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/578661#M10030</guid>
      <dc:creator>carlosGoetz</dc:creator>
      <dc:date>2019-08-02T11:29:13Z</dc:date>
    </item>
    <item>
      <title>Re: VTA bulk PDF/Docx analysis</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/579792#M10031</link>
      <description>&lt;P&gt;Hello Carlos -&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can import many PDF files into Viya for use in VTA with the data import function:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/?docsetId=datahub&amp;amp;docsetTarget=p1sv89vo4n8f03n0zvq0k90i8g3t.htm&amp;amp;docsetVersion=2.2&amp;amp;locale=en#p0djykgy2p8o16n11ceibyt2lk2o"&gt;https://go.documentation.sas.com/?docsetId=datahub&amp;amp;docsetTarget=p1sv89vo4n8f03n0zvq0k90i8g3t.htm&amp;amp;docsetVersion=2.2&amp;amp;locale=en#p0djykgy2p8o16n11ceibyt2lk2o&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In VTA, you can define category rules that flag the documents containing the clauses you are looking for:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/?activeCdc=ctxtcdc&amp;amp;cdcId=capcdc&amp;amp;cdcVersion=8.4&amp;amp;docsetId=ctxtug&amp;amp;docsetTarget=p0ardt2s3i6myvn1ny1c3h5n7iiv.htm&amp;amp;locale=en&amp;amp;docsetVersion=8.4#n0c4hjww14gcnin1txvt0t2ebqfz"&gt;https://go.documentation.sas.com/?activeCdc=ctxtcdc&amp;amp;cdcId=capcdc&amp;amp;cdcVersion=8.4&amp;amp;docsetId=ctxtug&amp;amp;docsetTarget=p0ardt2s3i6myvn1ny1c3h5n7iiv.htm&amp;amp;locale=en&amp;amp;docsetVersion=8.4#n0c4hjww14gcnin1txvt0t2ebqfz&lt;/A&gt;&lt;/P&gt;
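&lt;P&gt;To make the idea concrete outside of VTA: the sketch below is plain Python, not VTA rule syntax, and only illustrates the logic of sorting documents into "has the clause" and "does not have the clause". The function names, file names, and sample texts are hypothetical, and it assumes the PDF/Docx files have already been converted to plain text.&lt;/P&gt;

```python
import re


def has_clause(text, phrases):
    """Return True if any phrase occurs in text (whole-word, case-insensitive)."""
    return any(
        re.search(r"\b" + re.escape(p) + r"\b", text, flags=re.IGNORECASE)
        for p in phrases
    )


def categorize(documents, phrases):
    """Split document names into those that contain a phrase and those that do not."""
    with_clause, without_clause = [], []
    for name, text in documents.items():
        target = with_clause if has_clause(text, phrases) else without_clause
        target.append(name)
    return with_clause, without_clause


# Hypothetical sample data standing in for text extracted from PDF/Docx files.
docs = {
    "contract_a.txt": "This agreement includes a non-compete clause.",
    "contract_b.txt": "This agreement covers delivery terms only.",
}
found, missing = categorize(docs, ["non-compete clause"])
print(found)    # ['contract_a.txt']
print(missing)  # ['contract_b.txt']
```

&lt;P&gt;A category rule in VTA plays the role of has_clause here; the real rule language and matching behavior are described in the documentation linked above.&lt;/P&gt;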
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;VTA tests your model on the imported PDF files, but you can also apply the model to new data (scoring), as described here:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/?activeCdc=ctxtcdc&amp;amp;cdcId=capcdc&amp;amp;cdcVersion=8.4&amp;amp;docsetId=ctxtug&amp;amp;docsetTarget=n0jaksb16blajpn14qmqw67snm7t.htm&amp;amp;locale=en&amp;amp;docsetVersion=8.4"&gt;https://go.documentation.sas.com/?activeCdc=ctxtcdc&amp;amp;cdcId=capcdc&amp;amp;cdcVersion=8.4&amp;amp;docsetId=ctxtug&amp;amp;docsetTarget=n0jaksb16blajpn14qmqw67snm7t.htm&amp;amp;locale=en&amp;amp;docsetVersion=8.4&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope it helps!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Aug 2019 05:19:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/579792#M10031</guid>
      <dc:creator>Jason7</dc:creator>
      <dc:date>2019-08-08T05:19:05Z</dc:date>
    </item>
    <item>
      <title>Re: VTA bulk PDF/Docx analysis</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/580083#M10032</link>
      <description>&lt;P&gt;Thank you very much.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have another question: if I only need to check whether each document in a set contains the word "Wexner", can I create a pipeline with just two nodes, Data and Categories?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've created a category named Bueno with the rule (NOT,("Wexner")), but when I run the node I get the following error message (shown in Spanish in my interface, translated here):&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;An error occurred while the pipeline was running. See the node logs for more details.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;... and the log says:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Exception occurred while querying categories table: category document table with the specified taxonomyId not found: 4a7f7f286c4c6558016c751433ff0004&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can you tell me what I'm doing wrong?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Also, I've attached an image showing the matches for a document. The lower part of the screen lists 3 out of 4 documents as containing the word Anova, yet on the right I see 0 matches. Why is that, and what does it mean?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much!&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Carlos&lt;/P&gt;</description>
      <pubDate>Fri, 16 Aug 2019 06:34:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/580083#M10032</guid>
      <dc:creator>carlosGoetz</dc:creator>
      <dc:date>2019-08-16T06:34:09Z</dc:date>
    </item>
    <item>
      <title>Re: VTA bulk PDF/Docx analysis</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/581012#M10033</link>
      <description>Can anybody help me with this issue, please?&lt;BR /&gt;Thank you very much.</description>
      <pubDate>Wed, 14 Aug 2019 08:05:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/VTA-bulk-PDF-Docx-analysis/m-p/581012#M10033</guid>
      <dc:creator>carlosGoetz</dc:creator>
      <dc:date>2019-08-14T08:05:35Z</dc:date>
    </item>
  </channel>
</rss>

