06-23-2017 10:25 AM
I'm new to text mining. I'm clustering some responses, but when I analyze the results there is a specific situation in the Distance Between Clusters window that I can't make sense of, which is the following:
As you can see, there are two clusters, which I highlighted, that are very close to each other.
But the two are very different in meaning. How is this possible? How can I explain it?
06-26-2017 10:41 PM
I don't know your data, so it is hard for me to interpret your result. I ran an experiment on a problem of my own, and the results in my plot seem reasonable: more similar clusters "tend" to be plotted closer to one another.
That plot uses multidimensional scaling, based on the cluster similarities derived from the cluster means. The result is a view of high-dimensional data in a two-dimensional space, and you do lose information when you do this. The optimization is over ALL the cluster pairs at once, so the plotted distance between any particular pair of clusters will not necessarily correspond to their true distance in the higher-dimensional space. Two clusters can therefore appear close in the plot even though they are far apart in the original space.
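To make the information loss concrete, here is a minimal sketch in Python with NumPy. It is not the software's actual implementation: it uses classical (Torgerson) MDS as a stand-in, and the cluster means are random, hypothetical data (6 clusters in 20 dimensions). It compares each pair's plotted 2D distance with its true distance.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances
    approximate the given distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering the squared distances gives a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the directions with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical cluster means: 6 clusters in a 20-dimensional term space.
rng = np.random.default_rng(0)
cluster_means = rng.normal(size=(6, 20))

true_dist = pairwise_distances(cluster_means)
coords_2d = classical_mds(true_dist)
plot_dist = pairwise_distances(coords_2d)

# Ratio of plotted distance to true distance for each cluster pair.
rows, cols = np.triu_indices(len(cluster_means), k=1)
ratio = plot_dist[rows, cols] / true_dist[rows, cols]
for i, j, r in zip(rows, cols, ratio):
    print(f"clusters {i} and {j}: plotted/true distance = {r:.2f}")
```

The ratio differs from pair to pair, so the pairs that are squeezed the most look much closer in the 2D plot than they really are. Other MDS variants distort distances differently, but the same caveat applies.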
Also, if you explore the terms the two clusters have in common beyond the initially reported terms, you might discover more similarities than you first realized.