<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sentiment Analysis Workbench Corpus format in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Sentiment-Analysis-Workbench-Corpus-format/m-p/115462#M9289</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I just solved my own question.&amp;nbsp; Does it count if I mark this as the right answer?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I went into the directory where SAS Sentiment Analysis Workbench is installed.&amp;nbsp; There is a "test_documents" folder with an example corpus.&amp;nbsp; It looks like the corpus needs to be a zipped folder of XML files.&amp;nbsp; Each document has the following format:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;lt;doc&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;docid&amp;gt;&amp;lt;![CDATA[name of the .xml file, without the extension]]&amp;gt;&amp;lt;/docid&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;title&amp;gt;&amp;lt;![CDATA[subject title here]]&amp;gt;&amp;lt;/title&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;createtime&amp;gt;&amp;lt;![CDATA[10/6/2008 10:00:00 AM]]&amp;gt;&amp;lt;/createtime&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;body&amp;gt;&amp;lt;![CDATA[blah blah blah yadda yadda yadda text text text]]&amp;gt;&amp;lt;/body&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;/doc&amp;gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What sucks is that the SAS sentiment tools don't appear to build my corpus for me (unless I am missing something?).&amp;nbsp; Instead, I have the joy of converting all of my text files into XML files with this format.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I manually converted 5 of my .txt files to .xml using the XML structure above.&amp;nbsp; I was able to upload that corpus successfully.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 19 Aug 2013 22:30:32 GMT</pubDate>
    <dc:creator>jaredp</dc:creator>
    <dc:date>2013-08-19T22:30:32Z</dc:date>
    <item>
      <title>Sentiment Analysis Workbench Corpus format</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Sentiment-Analysis-Workbench-Corpus-format/m-p/115461#M9288</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I've installed the various Sentiment Analysis tools (Studio, Server and Workbench).&amp;nbsp; I've already created my training corpus and built a Statistical Model in Studio.&amp;nbsp; I've uploaded the model to the Server.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am now creating a new project in Workbench.&amp;nbsp; There is a tab where I specify my corpus and upload it.&amp;nbsp; The upload fails every time with the error "Unable to upload file".&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The file I am uploading is a zipped folder of text files. Here are my guesses as to what may be happening:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1) The file is being uploaded to a folder that I (i.e. the web server or Workbench user) may not have permission to access.&amp;nbsp; But what folder would that be?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2) Perhaps the folder itself is not uploaded, and instead its contents are read and placed into the MySQL database?&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;3) The file format is incorrect.&amp;nbsp; I also tried zipping only the text documents, without the enclosing folder.&amp;nbsp; That did not work either.&amp;nbsp; Perhaps the format of the files themselves is not acceptable.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have no clue how to proceed.&amp;nbsp; Any suggestions are appreciated.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 Aug 2013 22:09:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Sentiment-Analysis-Workbench-Corpus-format/m-p/115461#M9288</guid>
      <dc:creator>jaredp</dc:creator>
      <dc:date>2013-08-19T22:09:45Z</dc:date>
    </item>
    <item>
      <title>Re: Sentiment Analysis Workbench Corpus format</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Sentiment-Analysis-Workbench-Corpus-format/m-p/115462#M9289</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I just solved my own question.&amp;nbsp; Does it count if I mark this as the right answer?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I went into the directory where SAS Sentiment Analysis Workbench is installed.&amp;nbsp; There is a "test_documents" folder with an example corpus.&amp;nbsp; It looks like the corpus needs to be a zipped folder of XML files.&amp;nbsp; Each document has the following format:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;lt;doc&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;docid&amp;gt;&amp;lt;![CDATA[name of the .xml file, without the extension]]&amp;gt;&amp;lt;/docid&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;title&amp;gt;&amp;lt;![CDATA[subject title here]]&amp;gt;&amp;lt;/title&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;createtime&amp;gt;&amp;lt;![CDATA[10/6/2008 10:00:00 AM]]&amp;gt;&amp;lt;/createtime&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;body&amp;gt;&amp;lt;![CDATA[blah blah blah yadda yadda yadda text text text]]&amp;gt;&amp;lt;/body&amp;gt;&lt;/P&gt;&lt;P&gt;&amp;lt;/doc&amp;gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What sucks is that the SAS sentiment tools don't appear to build my corpus for me (unless I am missing something?).&amp;nbsp; Instead, I have the joy of converting all of my text files into XML files with this format.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I manually converted 5 of my .txt files to .xml using the XML structure above.&amp;nbsp; I was able to upload that corpus successfully.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 19 Aug 2013 22:30:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Sentiment-Analysis-Workbench-Corpus-format/m-p/115462#M9289</guid>
      <dc:creator>jaredp</dc:creator>
      <dc:date>2013-08-19T22:30:32Z</dc:date>
    </item>
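    <!--
    The manual .txt to .xml conversion described in the reply above could probably be scripted.
    What follows is only a minimal sketch, not anything shipped with the SAS tools. The function
    names (cdata, to_doc_xml, build_corpus) are hypothetical. It also assumes things the thread
    does not confirm: that the title can simply reuse the file name, that the file modification
    time is acceptable for createtime, that the input files are UTF-8, and that the XML files can
    sit at the top level of the zip archive. Only the element layout and the timestamp style come
    from the sample document in the post.

    ```python
    import zipfile
    from datetime import datetime
    from pathlib import Path


    def cdata(text: str) -> str:
        # A CDATA section cannot contain "]]>", so split any occurrence across two sections.
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


    def to_doc_xml(docid: str, title: str, created: datetime, body: str) -> str:
        # Timestamp style copied from the sample: month/day/year, 12-hour clock with AM/PM.
        hour12 = created.hour % 12 or 12
        ampm = "AM" if created.hour < 12 else "PM"
        stamp = (
            f"{created.month}/{created.day}/{created.year} "
            f"{hour12}:{created.minute:02d}:{created.second:02d} {ampm}"
        )
        lines = [
            "<doc>",
            f"<docid>{cdata(docid)}</docid>",
            f"<title>{cdata(title)}</title>",
            f"<createtime>{cdata(stamp)}</createtime>",
            f"<body>{cdata(body)}</body>",
            "</doc>",
        ]
        return "\n".join(lines) + "\n"


    def build_corpus(src_dir: str, zip_path: str) -> int:
        # Convert every .txt file in src_dir and write the results into one zip archive.
        # Assumption: docid and title both reuse the file name without its extension.
        count = 0
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for txt in sorted(Path(src_dir).glob("*.txt")):
                created = datetime.fromtimestamp(txt.stat().st_mtime)
                body = txt.read_text(encoding="utf-8")
                xml = to_doc_xml(txt.stem, txt.stem, created, body)
                archive.writestr(f"{txt.stem}.xml", xml)
                count += 1
        return count
    ```

    Calling build_corpus("my_text_files", "corpus.zip") with your own paths would return the
    number of documents written. It is worth comparing one generated file against the samples in
    the "test_documents" folder before uploading, since the archive layout is a guess.
    -->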
  </channel>
</rss>

