gabras
Pyrite | Level 9

Hi everybody,

 

I'm new to text mining. I'm clustering some responses, but when I analyze the results I can't make sense of a specific situation that appears in the Distance Between Clusters window:

DbC_TextMiner.PNG

As you can see, the two clusters I highlighted are very close to each other.

But the two are very different in meaning. How is this possible? How can I explain it?

 

Thank you

RussAlbright
SAS Employee

I don't know your data, so it is hard for me to interpret your result. I experimented on a problem of my own, and the results in my plot seem reasonable: more similar clusters "tend" to be plotted closer to one another.

 

That plot uses multidimensional scaling, based on the cluster similarities derived from the cluster means. The result is a two-dimensional view of high-dimensional data, and you do lose information when you do this. The optimization is over ALL the cluster pairs, so the plotted distance between any particular pair of clusters will not necessarily correspond to their true distance in the higher-dimensional space.
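
To make the information loss concrete, here is a minimal Python sketch. It is not what SAS Text Miner runs internally; it is a generic stress-minimizing MDS written with the standard library only, and the function name `mds_2d`, the step size, and the iteration count are all assumptions chosen for illustration. It takes four hypothetical clusters that are all exactly the same distance (1.0) from each other in the original space and places them in two dimensions. Because four points in a plane cannot all be equidistant, some pairs necessarily end up drawn closer together than others, even though no pair is actually more similar.

```python
import itertools
import math
import random


def mds_2d(dist, n_iter=2000, lr=0.05, seed=0):
    """Place n points in 2-D by gradient descent on the raw stress
    sum over pairs of (plotted_distance - target_distance)^2.
    Illustrative only; not the SAS implementation."""
    rng = random.Random(seed)
    n = len(dist)
    pts = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(n_iter):
        grads = [[0.0, 0.0] for _ in range(n)]
        for i, j in itertools.combinations(range(n), 2):
            dx = pts[i][0] - pts[j][0]
            dy = pts[i][1] - pts[j][1]
            d = math.hypot(dx, dy) or 1e-12  # guard against division by zero
            g = 2.0 * (d - dist[i][j]) / d   # derivative of the squared error
            grads[i][0] += g * dx
            grads[i][1] += g * dy
            grads[j][0] -= g * dx
            grads[j][1] -= g * dy
        for p, g in zip(pts, grads):
            p[0] -= lr * g[0]
            p[1] -= lr * g[1]
    return pts


# Hypothetical input: four clusters, every pair exactly 1.0 apart.
n_clusters = 4
true_dist = [[0.0 if i == j else 1.0 for j in range(n_clusters)]
             for i in range(n_clusters)]

coords = mds_2d(true_dist)

# Distances as they would appear in the 2-D plot.
plotted = {
    (i, j): math.dist(coords[i], coords[j])
    for i, j in itertools.combinations(range(n_clusters), 2)
}

for (i, j), d in sorted(plotted.items()):
    print(f"clusters {i} and {j}: true distance 1.00, plotted {d:.2f}")
```

The plotted distances differ from pair to pair although the true distances are identical, which is the same effect that can make two unrelated clusters look like neighbors in the plot.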

 

Also, if you explore the terms the two clusters have in common further down the list than the initially reported terms, you might discover more similarities than you first realized.

 

Russ 

