
# Why is the range of Term Density cut-off in SAS Contextual Analysis from 0.5 to 6?

Hi,

While working with Topic Properties during topic modelling in SAS Contextual Analysis (version 14.2), I found an option to adjust the term density. The documentation says:

"Edit topic properties

You can edit the properties that affect all topics. Term density refers to how topics are populated with terms; it is defined by a number between 0.5 and 6 (the default value is 2). When term density is closer to 0.5, topics are more densely populated by terms. When term density is closer to 6, topics are less densely populated by terms. This value affects the number of documents that belong to a topic (for example, having fewer terms in a topic captures fewer documents). Values that you enter are rounded to the nearest integer or half-integer."

My question is, how is the term density calculated?

Term density usually refers to the number of times a term appears in a document as a proportion of the number of words in that document, which would give a value between 0 and 1.

So why do the options range from 0.5 to 6?

Thank you.

Accepted Solutions
Solution
‎03-14-2018 09:58 PM

## Re: Why is the range of Term Density cut-off in SAS Contextual Analysis from 0.5 to 6?

I posted this question to SAS Tech Support and received the reply below. I am posting it here for anyone who is interested. Thank you.

The term density actually relates to the number of standard deviations above the mean at which the term cutoff is set for a topic. So generally, with the smallest setting (0.5), you might get 40% of your terms above the mean + 0.5 standard deviations. With a value of 6, you would likely get well under 1%.
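For anyone who wants to see the idea in code, here is a minimal sketch. It assumes each term in a topic has a numeric weight and that the cutoff is the mean weight plus `density` standard deviations. The weights are made-up, and `terms_above_cutoff` is a hypothetical helper rather than anything from SAS Contextual Analysis, which may compute this differently internally.

```python
import random
import statistics


def terms_above_cutoff(weights, density):
    """Return the terms whose weight exceeds mean + density * standard deviation.

    `weights` maps term -> weight for a single topic (assumed input shape).
    `density` plays the role of the term density setting (0.5 to 6).
    """
    values = list(weights.values())
    cutoff = statistics.mean(values) + density * statistics.pstdev(values)
    return {term: w for term, w in weights.items() if w > cutoff}


# Hypothetical, right-skewed term weights for one topic (not real SAS output).
rng = random.Random(0)
weights = {f"term{i}": rng.expovariate(1.0) for i in range(1000)}

for density in (0.5, 2, 6):
    kept = terms_above_cutoff(weights, density)
    print(f"density={density}: {len(kept)} of {len(weights)} terms kept")
```

A larger setting raises the cutoff, so fewer terms survive and the topic matches fewer documents. That is consistent with the documentation's statement that values closer to 6 give less densely populated topics.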
