jaredp
Quartz | Level 8

In SAS Text Miner, the Text Cluster node discovers themes and assigns each document to exactly one of them.  The Text Topic node also discovers themes, but assigns each document to zero or more of them.

Do any of you have "rules of thumb" when preferring one over the other?

I've come to feel that the Text Cluster node is suited for documents that generally focus on a particular topic because when multiple concepts are present in a document, the chosen theme could be 'biased' (for lack of a better word).  Let me illustrate with an example.

In my customer surveys, respondents will sometimes respond as follows (fictional response):

"Your product could use some improvement.  Here are three suggestions: 1) The colours don't work together or match other products.  2) It's too expensive for the features provided.  3) It's much larger than your competitors'."

Say the Text Cluster node determined three themes from the corpus: Improve colour, Improve pricing, and Improve size.  We know the Text Cluster node will assign our example comment (document) to exactly one of the above themes.  Picking one ignores the other two items written in the document.  The Text Topic node would likely assign the document to all three themes.
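To make the difference concrete, here is a toy sketch in Python.  It is not how SAS Text Miner works internally: the relevance scores, the 0.3 cutoff, and the function names are all made up for illustration.  It only shows the two assignment styles, one theme per document versus zero or more themes per document.

```python
# Toy illustration only: the scores and the cutoff are invented,
# and this is not the SAS Text Miner algorithm.

def assign_cluster(scores):
    """Hard assignment: return the single highest-scoring theme."""
    return max(scores, key=scores.get)

def assign_topics(scores, cutoff=0.3):
    """Soft assignment: return every theme whose score meets the cutoff.

    The result can be empty if no theme scores high enough.
    """
    return [theme for theme, score in scores.items() if score >= cutoff]

# Hypothetical scores for the fictional survey comment, which mentions
# colour, price, and size.
example_scores = {
    "Improve colour": 0.45,
    "Improve pricing": 0.40,
    "Improve size": 0.35,
}

# Hypothetical scores for a comment that touches none of the themes.
off_topic_scores = {
    "Improve colour": 0.05,
    "Improve pricing": 0.10,
    "Improve size": 0.02,
}

print(assign_cluster(example_scores))   # one theme only
print(assign_topics(example_scores))    # every theme above the cutoff
print(assign_topics(off_topic_scores))  # may be an empty list
```

With these made-up numbers, the cluster-style assignment keeps only the colour theme even though pricing and size score nearly as high, while the topic-style assignment keeps all three.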

In practice, I still use both nodes.  In cases like the fictional document above, I view the cluster node as 'pragmatic'.  That is, if forced to pick a theme, it picks the most 'appropriate' one.

Any suggestions on when to use these nodes?

2 REPLIES
jaredp
Quartz | Level 8

These links are good.  They help me understand topics and SVD much better.  I wonder if the fourth post will ever be written.  It looks like the author took a couple of years off from blogging.

What these links do not do is compare the Text Cluster and Text Topic nodes.

I also liked 2 other articles he linked to.  The first is about synonyms:

http://blogs.sas.com/content/text-mining/2008/12/11/when-are-synonyms-useful/

and the second is a real example of SVD:

http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html?_r=3&pagewanted=all&
