SAS Text Miner- Text Topic Results Question

New Contributor
Posts: 3

Hi all,

I feel like this is an easy question that I just can't find an answer to. In my Text Topic node results, there are 15 topics, each containing 5 words. The results set also has a column labeled "Number of Terms" with values from 1 to 4: most of the 15 topics show a value of 1 in this column, with a few 2s, 3s, and 4s. What is the significance of this "Number of Terms" column? What does the value mean? It does not correlate with the number of terms shown in the actual topic column. Appreciate any help!


All Replies
SAS Employee
Posts: 106

Re: SAS Text Miner- Text Topic Results Question

Hi, Mrs. Mee.

The TT node displays 5 terms per topic but these are only the top 5. There could be many more terms associated with a given topic.

I think the TT node excludes terms that do not meet the term cutoff (which you can view and edit in the Topic Viewer).

To confirm, try opening the Topic Viewer (available in Node Properties for the TT node) and look at the Terms table. How many terms do you see there for each topic? How many of them meet the Term Cutoff in the Topics table?

Hope this helps.

Ray

New Contributor
Posts: 3

Re: SAS Text Miner- Text Topic Results Question

Thanks for the reply. I am OK with the number of terms per topic. Where I am confused is the counter value under the label "Number of Terms". You would think it would correlate with the top 5 and read 5, but it does not. It is always less than 5. I am trying to understand the definition of this particular column.

Thanks!

SAS Employee
Posts: 106

Re: SAS Text Miner- Text Topic Results Question

Hi. Have you tried exploring your topics in the Topic Viewer? It is a separate window from the results for the Text Topic node. You'll find it in the node properties, and Help should have an example.

Ray

Solution
‎08-16-2016 02:25 PM
SAS Employee
Posts: 31

Re: SAS Text Miner- Text Topic Results Question

I think that value is the number of terms whose weight is above the term cutoff threshold. Is it possible that you have 15 topics but your data set is so small that not many terms made the cutoff? I just looked at a topics table, and the values there are much larger than 5.
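
To make that concrete, here is a minimal sketch in Python (not SAS code) of how the two numbers could differ, assuming the explanation above is right. The term weights, the cutoff value, and the helper names `top_terms` and `number_of_terms` are all made up for illustration; they are not taken from SAS Text Miner.

```python
# Hypothetical term weights for a single topic. These numbers and the
# cutoff below are invented for illustration, not SAS Text Miner output.
topic_weights = {
    "engine": 0.41,
    "oil": 0.12,
    "leak": 0.09,
    "noise": 0.07,
    "brake": 0.05,
    "tire": 0.03,
}
term_cutoff = 0.10  # assumed threshold


def top_terms(weights, n=5):
    """Return the n highest-weighted terms, ignoring any cutoff."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:n]


def number_of_terms(weights, cutoff):
    """Count the terms whose weight is above the cutoff."""
    return sum(1 for w in weights.values() if w > cutoff)


displayed = top_terms(topic_weights)
counted = number_of_terms(topic_weights, term_cutoff)

print("Topic label terms:", displayed)
print("Number of Terms:", counted)
```

In this made-up case the topic label still lists five words, while only two terms clear the cutoff, which is the kind of mismatch described in the question.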

New Contributor
Posts: 3

Re: SAS Text Miner- Text Topic Results Question

Thanks Russ!!  Yes, that is entirely possible and the number would make sense if that were the case.

Appreciate your feedback!

☑ This topic is solved.