I created n clusters using the HP Cluster node (k-means) in SAS Miner, but every time I try to replicate them I get different clusters. Even running it in EG with different initializations gives me different clusters. My questions are:
1- Is there a way to fix these clusters and make my work replicable?
2- If I can't fix my clusters, is there a way to test their stability, for example by using an overlap rate and saying that above 75% overlap the clusters are stable?
3- I couldn't find any straightforward answer about the stability of clusters and how important it is. Can we speak about the stability of a clustering in this situation? Is it very important to test stability before using the clusters? Which measures can do that? Are there any nodes in SAS Miner that can do that?
I'm a little bit lost with this question of stability. Thank you for your understanding!
K-means clustering doesn't have a single unique solution. Rather, there is a set of possible solutions, and it's about picking the one that makes the most sense for your use case. In particular, the algorithm converges to a local optimum that depends on the starting centers, so if you change the initialization parameters the clusters can come out different.
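To illustrate the point about initialization, here is a minimal sketch in plain Python (not SAS code) of Lloyd's k-means where the only source of randomness is the choice of starting centers. The function name `kmeans`, its `seed` parameter, and the toy data are all made up for this illustration. The SAS equivalent would be to fix whatever seed or initial-centers option your clustering node exposes; the name of that option is not stated in this thread.

```python
import random

def kmeans(points, k, seed, n_iter=50):
    """Plain Lloyd's k-means on a list of numeric tuples; returns one label per point."""
    rng = random.Random(seed)        # private RNG, so the run depends only on `seed`
    centers = rng.sample(points, k)  # random starting centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Toy data (hypothetical): three noisy 2-D blobs of 30 points each.
data_rng = random.Random(0)
points = [(data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]

run_a = kmeans(points, k=3, seed=42)
run_b = kmeans(points, k=3, seed=42)
same_seed_identical = (run_a == run_b)  # same seed and same data order: same labels
```

With the seed, the data, and the row order all held fixed, the two runs are identical. Changing the seed may give a different partition, or the same partition with the cluster numbers permuted.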
If your clusters are unstable, it may mean they are not distinct enough, and you should consider reducing the number of clusters to get a more stable solution. How did you pick the number of clusters?
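One common way to put a number on stability is to compare the labels from two runs with the adjusted Rand index (ARI), which ignores how the clusters happen to be numbered. This is a swapped-in measure, not the "overlap rate" from the question and not something a SAS node is known to produce here. The sketch below is plain Python; the function name and the example label vectors are assumptions for illustration only.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same observations.

    1.0 means identical partitions (up to renaming the clusters);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same observations")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: nothing to disagree on

    # Pairs of observations that sit together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs that sit together within each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # chance-level agreement
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (maximum - expected)

# Hypothetical labels from two runs on the same four rows.
ari_renamed = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])    # same grouping, renamed
ari_different = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # groupings cut across each other
```

In practice you would rerun the clustering several times (different seeds, or resampled data), compute the index for each pair of runs, and look at the spread. Any cut-off, such as the 75% mentioned in the question, is a project-specific judgment call rather than something the index prescribes.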
Here's a picture of the selection of the number of clusters using ABC selection.