BeiJia
Calcite | Level 5

Hi,

 

When coming across Topic Properties during Topic Modelling in SAS Contextual Analysis (Version 14.2), there is an option to adjust the term density.

[Screenshot: Topic Properties.png]

Below is a quote from the SAS CA User guide about this. 

 

"Edit topic properties

You can edit the properties that affect all topics. Term density refers to how topics are populated with terms; it is defined by a number between 0.5 and 6 (the default value is 2). When term density is closer to 0.5, topics are more densely populated by terms. When term density is closer to 6, topics are less densely populated by terms. This value affects the number of documents that belong to a topic (for example, having fewer terms in a topic captures fewer documents). Values that you enter are rounded to the nearest integer or half-integer.

"

My question is: how is the term density calculated?

Term density usually refers to the number of times a term appears in a document as a proportion of the number of words in that document, which would give a value between 0 and 1.

So why do the options range from 0.5 to 6?

 

Thank you. 

 

Accepted Solution
BeiJia
Calcite | Level 5

I posted this question to SAS Tech Support and received the following answer.

I am posting it here for anyone who is interested. Thank you.

 

Term density actually relates to the number of standard deviations above the mean at which the term cutoff is set for a topic. So generally, with the smallest setting (0.5), you might get 40% of your terms above the mean + 0.5 standard deviations. With a value of 6, you would likely get well under 1%.
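
To make the cutoff idea concrete, here is a minimal Python sketch of a "mean plus k standard deviations" threshold. It is only an illustration of the rule described above, not SAS Contextual Analysis code: the function name `terms_above_cutoff`, the term weights, and the use of the population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev


def terms_above_cutoff(weights, density):
    """Keep terms whose weight exceeds mean + density * standard deviation.

    `weights` maps each term to a hypothetical topic weight; `density`
    plays the role of the term density setting (0.5 to 6).
    """
    values = list(weights.values())
    # Assumption: population standard deviation; the product may differ.
    cutoff = mean(values) + density * pstdev(values)
    return {term: w for term, w in weights.items() if w > cutoff}


# Made-up term weights for a single topic (illustrative only).
weights = {
    "loan": 0.90, "bank": 0.60, "rate": 0.30, "credit": 0.20,
    "the": 0.05, "and": 0.04, "of": 0.03, "to": 0.02,
}

for density in (0.5, 2, 6):
    kept = terms_above_cutoff(weights, density)
    print(density, sorted(kept))
```

With these made-up weights, raising the density value raises the cutoff, so fewer terms survive: two terms at 0.5, one at 2, and none at 6. That matches the user guide's statement that values closer to 6 give topics that are less densely populated by terms.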

 
