<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Proc Hptmine: What is the formula behind assigning a topic to a document in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609689#M10065</link>
    <description>&lt;P&gt;Here is a quick summary:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;The U factor is number-of-terms by number-of-topics. For each column (topic), calculate the mean and standard deviation of the absolute values of the entries. I believe the default cutoff is 1 standard deviation above the mean. Set every entry whose absolute value is below that cutoff to zero. Then re-form the document projections from your updated U. Finally, repeat the same procedure on that result; this time you will be applying it to documents.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Russ&lt;/P&gt;</description>
    <pubDate>Thu, 05 Dec 2019 14:40:07 GMT</pubDate>
    <dc:creator>RussAlbright</dc:creator>
    <dc:date>2019-12-05T14:40:07Z</dc:date>
    <item>
      <title>Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609577#M10064</link>
      <description>&lt;P&gt;I am using PROC HPTMINE. It generates several output tables, such as the SVD matrices, docpro, terms, parent, topics, etc.&lt;/P&gt;&lt;P&gt;I joined some of these tables, using the term cutoff rates, to find the topic assigned to each document.&lt;/P&gt;&lt;P&gt;However, I did not get the same results as the Text Miner does in E-miner.&lt;/P&gt;&lt;P&gt;Can anyone tell me how I can assign a document to a particular text topic? There must be a formula that uses thresholds to do so.&lt;/P&gt;&lt;P&gt;Please help.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 00:30:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609577#M10064</guid>
      <dc:creator>eserates</dc:creator>
      <dc:date>2019-12-05T00:30:12Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609689#M10065</link>
      <description>&lt;P&gt;Here is a quick summary:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;The U factor is number-of-terms by number-of-topics. For each column (topic), calculate the mean and standard deviation of the absolute values of the entries. I believe the default cutoff is 1 standard deviation above the mean. Set every entry whose absolute value is below that cutoff to zero. Then re-form the document projections from your updated U. Finally, repeat the same procedure on that result; this time you will be applying it to documents.&lt;/P&gt;
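&lt;P&gt;As a rough illustration of the truncation step only, here is a minimal Python sketch. It is not the SAS code that Enterprise Miner runs. The function name truncate_columns, the n_std parameter, and the choice of the sample standard deviation are assumptions made for this example, and the matrix is represented as a plain list of rows.&lt;/P&gt;
&lt;PRE&gt;
```python
from operator import ge
from statistics import mean, stdev


def truncate_columns(matrix, n_std=1.0):
    """Zero out entries whose absolute value is below a per-column cutoff.

    For each column: cutoff = mean(|column|) + n_std * stdev(|column|).
    Needs at least two rows, because stdev() requires two data points.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    result = [row[:] for row in matrix]  # copy so the input is left unchanged
    for j in range(n_cols):
        abs_col = [abs(matrix[i][j]) for i in range(n_rows)]
        # Assumption: sample standard deviation (n - 1 denominator);
        # the thread does not say which variant the product uses.
        cutoff = mean(abs_col) + n_std * stdev(abs_col)
        for i in range(n_rows):
            # ge(a, b) is true when a is greater than or equal to b
            if not ge(abs_col[i], cutoff):
                result[i][j] = 0.0
    return result
```
&lt;/PRE&gt;
&lt;P&gt;The same function could be called once on U (terms by topics) and once more on the re-formed document projections (documents by topics), since both passes work column by column.&lt;/P&gt;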
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Russ&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 14:40:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609689#M10065</guid>
      <dc:creator>RussAlbright</dc:creator>
      <dc:date>2019-12-05T14:40:07Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609749#M10066</link>
      <description>&lt;P&gt;Thank you. I will try this and let you know if it works.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 17:49:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/609749#M10066</guid>
      <dc:creator>eserates</dc:creator>
      <dc:date>2019-12-05T17:49:00Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/610841#M10067</link>
      <description>&lt;P&gt;Hi again, and thanks for your last response. It definitely helped me make progress. Although it is a good step towards finding the topics assigned to documents, I still cannot match the topics assigned by the Text Miner in E-miner.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Instead of using the U matrix for the calculations you mentioned above, I used the &lt;STRONG&gt;DOCPRO&lt;/STRONG&gt; output from HPTMINE, since this U matrix is the projection of terms onto documents. Do you think it was OK to use it?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My second question: the TOPICS data set lists &lt;STRONG&gt;termcutoff&lt;/STRONG&gt; rates. Can I use those rates with the V matrix, checking whether they are above the values in the V matrix? Or do those cutoff rates need to be compared to some other values?&lt;/P&gt;&lt;P&gt;Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Dec 2019 21:25:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/610841#M10067</guid>
      <dc:creator>eserates</dc:creator>
      <dc:date>2019-12-10T21:25:02Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/610894#M10068</link>
      <description>&lt;P&gt;You have to truncate the U matrix using the technique I described and then re-form the docpro data set. PROC HPTMINE does not do this.&amp;nbsp;The process has quite a few steps and may be a challenge to re-implement. Have you considered just saving out the SAS code from your flow? Depending on what you're trying to accomplish, this code will allow you to submit the whole flow programmatically.&lt;/P&gt;
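&lt;P&gt;To make the order of operations concrete, here is a small Python sketch of truncating U, re-forming the document projections, and truncating again. It is an illustration under assumptions, not the macros that Enterprise Miner calls: the function names are hypothetical, the cutoffs are passed in as one number per topic, and re-forming docpro is approximated by a plain matrix product of a documents-by-terms weight matrix with the truncated U. The real flow may weight or normalize differently.&lt;/P&gt;
&lt;PRE&gt;
```python
from operator import ge


def apply_cutoffs(matrix, cutoffs):
    """Zero every entry whose absolute value is below its column's cutoff."""
    return [
        [value if ge(abs(value), cutoffs[j]) else 0.0 for j, value in enumerate(row)]
        for row in matrix
    ]


def project_documents(doc_term, u):
    """Plain matrix product: (documents x terms) times (terms x topics).

    Assumption: no extra weighting or normalization is applied here.
    """
    n_topics = len(u[0])
    return [
        [sum(weight * u[t][k] for t, weight in enumerate(doc_row)) for k in range(n_topics)]
        for doc_row in doc_term
    ]


def assign_topics(doc_term, u, term_cutoffs, doc_cutoffs):
    """Return, for each document, the topic indices that survive both cutoffs."""
    u_truncated = apply_cutoffs(u, term_cutoffs)         # step 1: truncate U
    docpro = project_documents(doc_term, u_truncated)    # step 2: re-form docpro
    docpro_truncated = apply_cutoffs(docpro, doc_cutoffs)  # step 3: truncate docpro
    return [[k for k, value in enumerate(row) if value != 0.0] for row in docpro_truncated]
```
&lt;/PRE&gt;
&lt;P&gt;A document can end up with zero, one, or several topics under this scheme, because each topic column is thresholded independently.&lt;/P&gt;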
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You would apply the term cutoffs to U, not V. U is number of terms by number of topics. Then you re-form docpro and apply the document cutoffs to docpro.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Dec 2019 04:05:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/610894#M10068</guid>
      <dc:creator>RussAlbright</dc:creator>
      <dc:date>2019-12-11T04:05:38Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/611143#M10069</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Last time you mentioned the mprint option, but that did not give me much insight into the code used by the miner; many macros were called. Are you talking about using the SAS Code node in E-miner, connected to the Text Topic node?&lt;/P&gt;&lt;P&gt;I have not tried that before. Can you please tell me how to save the flow code, or where to find instructions for it?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 11 Dec 2019 20:39:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/611143#M10069</guid>
      <dc:creator>eserates</dc:creator>
      <dc:date>2019-12-11T20:39:50Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Hptmine: What is the formula behind assigning a topic to a document</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/611316#M10070</link>
      <description>&lt;P&gt;I am talking about built-in macros that are called to do the parts of the computation that the procedure does not do. If you right-click on a node in your flow and choose "Export path to sas code", you can save the code that is run when your flow runs. If you look at that code, you will see the names of these macros. Also, if you add&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;options mprint;&lt;/P&gt;
&lt;P&gt;when you run the path, you will see a printout of many of the macros executing. The actual source of these macros is not visible otherwise.&lt;/P&gt;
&lt;P&gt;Russ&lt;/P&gt;</description>
      <pubDate>Thu, 12 Dec 2019 14:30:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Proc-Hptmine-What-is-the-formula-behind-assigning-a-topic-to-a/m-p/611316#M10070</guid>
      <dc:creator>RussAlbright</dc:creator>
      <dc:date>2019-12-12T14:30:16Z</dc:date>
    </item>
  </channel>
</rss>

