aha123
Obsidian | Level 7

For the Text Topic node, the default number of topics is 25. I would like to know how to determine the right number. For example, can I use the number of clusters I get from the Text Cluster node to decide how many topics to set for the Text Topic node?

1 REPLY
rayIII
SAS Employee

Hi.

I don't think the Text Topic node provides much in the way of guidance, but you might have a look at the HP Text Miner node (HPDM tab) if you have access to it. It gives various options for selecting the number of topics based on the percentage of total variance accounted for.

 

Here's a snippet from Help: 

 

"Suppose that the maximum number of SVD dimensions that you specify for the Max SVD Dimensions property is maxdim, and these maxdim SVD dimensions account for p% of the total variance. High resolution always generates the maximum number of SVD dimensions (maxdim). For medium resolution, the recommended number of SVD dimensions accounts for 5/6*(p% of the total variance). For low resolution, the recommended number of SVD dimensions accounts for 2/3*(p% of the total variance)."
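To make the Help text a bit more concrete, here is a rough Python sketch of the rule as I read it. It is not what the node actually runs. The function name `recommended_dimensions` and the example singular values are made up for illustration, and it assumes two things the snippet does not state: that the variance of each SVD dimension is proportional to its squared singular value, and that the node picks the smallest number of dimensions that reaches the target. Because the medium and low targets are fractions of p%, the total variance cancels out and only the maxdim singular values are needed.

```python
from itertools import accumulate

# Share of the variance captured by all maxdim dimensions that each
# resolution aims for, taken from the Help snippet quoted above.
RESOLUTION_FRACTION = {"low": 2 / 3, "medium": 5 / 6, "high": 1.0}


def recommended_dimensions(singular_values, resolution="medium"):
    """Smallest k whose leading k dimensions reach the resolution's target.

    singular_values: the maxdim singular values kept by the SVD.
    Assumption: variance per dimension is proportional to the squared
    singular value, and the smallest qualifying k is the recommendation.
    """
    if not singular_values:
        raise ValueError("need at least one singular value")
    if resolution not in RESOLUTION_FRACTION:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if resolution == "high":
        # High resolution always keeps every dimension (maxdim).
        return len(singular_values)

    # Largest dimensions first, so the running total is the variance
    # explained by the leading k dimensions.
    squared = sorted((s * s for s in singular_values), reverse=True)
    target = RESOLUTION_FRACTION[resolution] * sum(squared)

    for k, running_total in enumerate(accumulate(squared), start=1):
        if running_total >= target:
            return k
    return len(squared)


# Hypothetical singular values for maxdim = 8.
example = [10, 8, 6, 4, 3, 2, 1, 1]
for level in ("low", "medium", "high"):
    print(level, recommended_dimensions(example, level))
```

With these made-up values the squared singular values sum to 231, so low resolution needs 2 dimensions (164 of 231), medium needs 3 (200 of 231), and high keeps all 8.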

 

(You could also try posting to the Text Analytics Community, as they are the text mining experts.)

 

Ray
