<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why is the range of Term Density cut-off in SAS Contextual Analysis from 0.5 to 6? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Why-is-the-range-of-Term-Density-cut-off-in-SAS-Contextual/m-p/445695#M9901</link>
    <description>&lt;P&gt;I posted this question to SAS Tech Support and received the following answer.&lt;/P&gt;&lt;P&gt;I am posting it here for anyone who is interested. Thank you.&lt;/P&gt;&lt;P&gt;Term density is the number of standard deviations above the mean at which the term cutoff is set for a topic. So, generally, with the smallest setting (0.5), you might get 40% of your terms above the cutoff of mean + 0.5 standard deviations. With a value of 6, you would likely get well under 1%.&lt;/P&gt;</description>
    <pubDate>Thu, 15 Mar 2018 01:58:20 GMT</pubDate>
    <dc:creator>BeiJia</dc:creator>
    <dc:date>2018-03-15T01:58:20Z</dc:date>
    <item>
      <title>Why is the range of Term Density cut-off in SAS Contextual Analysis from 0.5 to 6?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Why-is-the-range-of-Term-Density-cut-off-in-SAS-Contextual/m-p/443201#M9900</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Under Topic Properties, during topic modelling in SAS Contextual Analysis (version 14.2), there is an option to adjust the term density.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Topic Properties.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/19022iFBFC1B6F591261A2/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Topic Properties.png" alt="Topic Properties.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Below is a quote from the SAS CA User Guide about this option.&lt;/P&gt;&lt;P&gt;"&lt;SPAN class="xis-leadinText"&gt;Edit topic properties&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV class="xis-paraSimple"&gt;You can edit the properties that affect all topics.&amp;nbsp;Term density refers to how topics are populated with terms; it is defined by a number between 0.5 and 6 (the default value is 2). When term density is closer to 0.5, topics are more densely populated by terms. When term density is closer to 6, topics are less densely populated by terms. This value affects the number of documents that belong to a topic (for example, having fewer terms in a topic captures fewer documents). Values that you enter are rounded to the nearest integer or half-integer.&lt;/DIV&gt;&lt;P&gt;"&lt;/P&gt;&lt;P&gt;My question is: how is the term density calculated?&lt;/P&gt;&lt;P&gt;Term density usually refers to the number of times a term appears in a document as a proportion of the number of words in that document, which would give a value between 0 and 1.&lt;/P&gt;&lt;P&gt;So why do the options range from 0.5 to 6?&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Mar 2018 08:04:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Why-is-the-range-of-Term-Density-cut-off-in-SAS-Contextual/m-p/443201#M9900</guid>
      <dc:creator>BeiJia</dc:creator>
      <dc:date>2018-03-07T08:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: Why is the range of Term Density cut-off in SAS Contextual Analysis from 0.5 to 6?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Why-is-the-range-of-Term-Density-cut-off-in-SAS-Contextual/m-p/445695#M9901</link>
      <description>&lt;P&gt;I posted this question to SAS Tech Support and received the following answer.&lt;/P&gt;&lt;P&gt;I am posting it here for anyone who is interested. Thank you.&lt;/P&gt;&lt;P&gt;Term density is the number of standard deviations above the mean at which the term cutoff is set for a topic. So, generally, with the smallest setting (0.5), you might get 40% of your terms above the cutoff of mean + 0.5 standard deviations. With a value of 6, you would likely get well under 1%.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Mar 2018 01:58:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Why-is-the-range-of-Term-Density-cut-off-in-SAS-Contextual/m-p/445695#M9901</guid>
      <dc:creator>BeiJia</dc:creator>
      <dc:date>2018-03-15T01:58:20Z</dc:date>
    </item>
  </channel>
</rss>

