Text mining and content categorization

Text Topics Node

Occasional Learner
Posts: 1

There is an option in the Text Topics Node in SAS Enterprise Miner 13.1 called "Correlated Topics", which can be set to either "Yes" or "No". The Text Topics Node uses the resulting text topics and the singular value decomposition to create numeric vectors. My questions are these:

1) What method is used to create the topics (LSA, LDA, etc.)?

2) How does SAS represent the topics? (I assume they are represented as vectors.)

3) Most important question: if the "Correlated Topics" option is set to "Yes", what exactly happens? Are the correlated text topic vectors combined into a single vector?

Any insights would be appreciated.

Frequent Contributor
Posts: 132

Re: Text Topics Node

To the best of my knowledge:

1. SVD.

2. As vectors: either rotated or unrotated eigenvectors from the SVD, depending on the 'allow correlated topics' node property setting.

3. No, they are not combined. The topics are allowed to be somewhat correlated while within-vector variance is minimised, which in turn makes it more likely that a human can label the topic theme with a relevant, grammatical phrase based on the most influential topic terms.
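To make the "SVD, then rotate" idea concrete, here is a rough NumPy sketch. It is not SAS code and not necessarily what the node does internally: the toy count matrix, the choice of two topics, and the use of a varimax (orthogonal) rotation are all my own assumptions for illustration. An oblique rotation, which would let the topics correlate, relaxes the orthogonality constraint enforced here.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (terms x topics) loading matrix."""
    n_terms, n_topics = loadings.shape
    rotation = np.eye(n_topics)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Standard SVD-based update for the varimax criterion.
        gradient = loadings.T @ (
            rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_terms
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        current = s.sum()
        if previous != 0.0 and current / previous < 1 + tol:
            break
        previous = current
    return loadings @ rotation, rotation

# Hypothetical document-by-term count matrix (rows: documents, columns: terms).
terms = ["engine", "brake", "wheel", "pasta", "sauce", "oven"]
counts = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [0, 1, 2, 0, 0, 1],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 0],
    [1, 0, 0, 1, 1, 2],
], dtype=float)

# Truncated SVD: keep k singular vectors as the unrotated "topics".
u, s, vt = np.linalg.svd(counts, full_matrices=False)
k = 2
term_loadings = vt[:k].T * s[:k]          # terms x k term weights

# Rotate the term weights so each topic loads heavily on fewer terms.
rotated_loadings, rotation = varimax(term_loadings)

# One numeric vector per document: its score on each rotated topic.
doc_scores = counts @ rotated_loadings

for j in range(k):
    top = np.argsort(-np.abs(rotated_loadings[:, j]))[:3]
    print(f"topic {j}:", [terms[i] for i in top])
```

The point of the rotation step is that the topics stay separate vectors; only the axes they are expressed in change, so that the heaviest terms per topic are easier to read as a theme.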

 

If I'm wrong, I'd appreciate a correction.
