The Text Cluster node in Text Miner performs SVD before clustering. Can anyone tell me the advantage of SVD here?
I think this oft-cited paper (http://www.cc.gatech.edu/~vempala/papers/dfkvv.pdf) explains it about as well as it can be explained. Basically, the authors show that clustering the low-rank SVD representation gives an approximate solution to the clustering problem on the actual dataset, and does so much faster. So that performance boost is probably the primary reason for the SVD step.
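To make the idea concrete, here is a minimal sketch in Python with NumPy. It is not what the Text Cluster node does internally; the toy document-term matrix, the helper names `svd_reduce` and `kmeans`, and the choice of k = 2 are all assumptions made for illustration. It reduces each document to a few SVD coordinates and then runs a plain k-means on those coordinates instead of on the full term counts.

```python
import numpy as np

# Toy document-term count matrix (made-up data): rows are documents,
# columns are terms. Docs 0-2 mostly use the first three terms,
# docs 3-5 mostly use the last three.
X = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 0, 0, 1, 0],
    [1, 2, 3, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 1, 0, 2, 3, 2],
    [0, 0, 0, 1, 2, 3],
], dtype=float)


def svd_reduce(X, k):
    """Hypothetical helper: project documents onto the top-k singular directions."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]  # shape: (n_documents, k)


def kmeans(points, n_clusters, n_iter=20):
    """Hypothetical helper: plain Lloyd's k-means with a deterministic start."""
    # Farthest-point initialization, so the sketch needs no random seed.
    centers = [points[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each point to its nearest center, then recompute the centers.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


Z = svd_reduce(X, k=2)            # 6 documents x 2 SVD dimensions
labels = kmeans(Z, n_clusters=2)  # cluster in the reduced space
print(Z.shape)
print(labels)
```

On a real corpus the term dimension can run to tens of thousands of columns while k stays in the tens or low hundreds, so the distance computations in the clustering step become far cheaper after the reduction.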
I dug a little deeper, and this tutorial does a great job of explaining it, starting with PCA and moving on to SVD: https://www.cs.princeton.edu/picasso/mats/PCA-Tutorial-Intuition_jp.pdf
A briefer Q&A that is also quite nice is here: https://www.quora.com/What-is-an-intuitive-explanation-of-the-relation-between-PCA-and-SVD
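If it helps to see the PCA/SVD relation numerically, here is a small sketch in Python with NumPy (the random data and variable names are assumptions for illustration only). It checks that the eigenvalues of the covariance matrix match the squared singular values of the centered data divided by n - 1, and that the principal directions match the right singular vectors up to sign.

```python
import numpy as np

# Made-up data: 50 observations of 4 correlated variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

# Both routes start from the column-centered data.
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# PCA route: eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]  # eigh returns ascending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# SVD route: singular values and right singular vectors of the centered data.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = s**2 / (n - 1)

# Component variances agree; directions agree up to a sign flip.
same_variances = np.allclose(eigvals, svd_vals)
same_directions = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(4), atol=1e-6)
print(same_variances, same_directions)
```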
Hope that helps!