@ycenycute -- The thing to understand about any such cluster selection approach ("elbow", CCC, ABC, etc.) is that there is no "right" answer. All of these approaches effectively try to identify the point at which creating additional clusters yields diminishing returns. Since there is no "correct" number of clusters, it is common to generate several cluster solutions and evaluate the usefulness of each one in light of your business/research questions of interest. Understanding the nuances of how each approach identifies good candidate solutions requires an understanding of the mathematics behind every statistic used, both in the clustering itself and in the assessment of the clusters. For example, a distance metric based on squared deviations might give a very different clustering than one based on absolute deviations, which does not penalize larger deviations as heavily. Even if you have a good understanding of those metrics, you must still consider any candidate solution in light of the original research/business question.
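To make the squared-versus-absolute point concrete, here is a minimal Python sketch (the numbers are made up for illustration and are not from your data). It compares the single "center" that minimizes squared deviations, which is the mean, with the one that minimizes absolute deviations, which is the median, when one value is far from the rest.

```python
from statistics import mean, median

# Hypothetical values: four close together and one far away.
values = [1, 2, 3, 4, 100]

def squared_cost(center):
    """Sum of squared deviations from a candidate center."""
    return sum((v - center) ** 2 for v in values)

def absolute_cost(center):
    """Sum of absolute deviations from a candidate center."""
    return sum(abs(v - center) for v in values)

# The mean minimizes squared deviations; the median minimizes absolute ones.
center_sq = mean(values)
center_abs = median(values)

print("center under squared deviations: ", center_sq)
print("center under absolute deviations:", center_abs)
print("squared cost at each center: ", squared_cost(center_sq), squared_cost(center_abs))
print("absolute cost at each center:", absolute_cost(center_sq), absolute_cost(center_abs))
```

The far-away value drags the squared-deviation center well away from the bulk of the data, while the absolute-deviation center stays put. A clustering procedure built on one criterion or the other can be pulled in the same way, which is one reason the choice of metric can change the clusters you get.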
It would be entirely expected for two people with the same data set but different business needs to settle on completely different cluster solutions as ideal. For example, someone who wants non-trivial group sizes for marketing purposes might tend toward a smaller number of clusters, and might even ignore outliers, in order to better separate the people in the middle of the pack and keep each market segment non-trivial. However, someone looking at the same data to understand new market opportunities might be willing to create a larger number of clusters so they can examine the clusters at the fringes, which, though small, may be emerging over time and point to new areas of opportunity. In either case, there might not be a particular metric that chooses the ultimate cluster solution for the business problem. The metrics get us closer to identifying good candidates, but it is always worth looking at a range of nearby solutions to better identify the best cluster solution for a particular business problem.
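As a rough illustration of looking at a range of nearby solutions, the Python sketch below uses a small made-up one-dimensional data set. For each number of clusters k it finds the partition with the lowest within-cluster sum of squares (WSS) by brute force, which is only practical for tiny 1-D examples and is a stand-in for whatever clustering procedure you actually run. It then prints the WSS and the cluster sizes side by side.

```python
from itertools import combinations

# Hypothetical 1-D data with three visually obvious groups.
data = sorted([1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.2, 8.8])

def wss(group):
    """Within-cluster sum of squares for one group of values."""
    m = sum(group) / len(group)
    return sum((v - m) ** 2 for v in group)

def best_partition(values, k):
    """Try every way to cut the sorted values into k contiguous groups
    and return (total WSS, groups) for the cut with the lowest WSS."""
    n = len(values)
    best = None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        groups = [values[a:b] for a, b in zip(bounds, bounds[1:])]
        total = sum(wss(g) for g in groups)
        if best is None or total < best[0]:
            best = (total, groups)
    return best

# Evaluate a range of candidate solutions rather than a single k.
solutions = {k: best_partition(data, k) for k in range(1, 6)}
for k, (total, groups) in solutions.items():
    print(f"k={k}  WSS={total:.2f}  sizes={[len(g) for g in groups]}")
```

The WSS keeps dropping as k grows, but the improvement shrinks quickly, which is the "elbow" idea. Whether you would prefer the solution at the elbow or a neighbor with smaller fringe clusters still depends on the business question, so printing the sizes alongside the metric is often as useful as the metric itself.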
Another thing to consider is that cluster solutions depend on the variables that are included, so adding or removing a variable changes the potential solution. If you try to put a large number of variables into a single cluster solution, chances are that only a small subset of those variables is really driving the clustering. In many cases, it makes more sense to create several cluster solutions for different subsets of variables that are reasonably considered together. For example, suppose you had information related to recency of purchases, frequency of purchases, and amount of purchases over various time windows (e.g. over the last 30, 60, 90, 180, 360 days). Rather than cramming all of the variables into one cluster solution, it might be far more effective to cluster each of the subgroups of variables separately. You could then build a profile for each potential buyer based on the cluster prediction from each of the three cluster solutions (Recency, Frequency, Monetary), which would give a clearer picture of your candidates. Again, since there is no "correct" cluster solution, you can build any such candidate cluster solutions based on your particular business need. The choice among them in the end is more likely to be driven by the business/research question than by any particular metric.
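Here is a small Python sketch of the separate-solutions idea. Everything in it is hypothetical: the customers, the variable names, and the use of a single variable per dimension. The helper `two_group_labels` is a deliberately simple two-group split on one variable and only stands in for a real cluster procedure run on each subgroup of variables.

```python
def two_group_labels(values):
    """Label each value 0 (low) or 1 (high), using the cut of the sorted
    values that minimizes the within-group sum of squares."""
    def wss(group):
        m = sum(group) / len(group)
        return sum((v - m) ** 2 for v in group)

    ordered = sorted(values)
    best_cut = min(range(1, len(ordered)),
                   key=lambda i: wss(ordered[:i]) + wss(ordered[i:]))
    threshold = ordered[best_cut]
    return [int(v >= threshold) for v in values]

# Hypothetical customers with one variable per dimension.
customers = {
    "A": {"recency_days": 5,   "orders": 12, "spend": 900.0},
    "B": {"recency_days": 7,   "orders": 2,  "spend": 80.0},
    "C": {"recency_days": 90,  "orders": 11, "spend": 120.0},
    "D": {"recency_days": 120, "orders": 1,  "spend": 950.0},
}
names = list(customers)

# Cluster each dimension on its own instead of all variables at once.
labels = {
    field: two_group_labels([customers[n][field] for n in names])
    for field in ("recency_days", "orders", "spend")
}

# Combine the three separate assignments into one profile per customer:
# (recency group, frequency group, monetary group).
# Note that recency group 0 means fewer days since purchase, i.e. more recent.
profiles = {
    n: (labels["recency_days"][i], labels["orders"][i], labels["spend"][i])
    for i, n in enumerate(names)
}

for name, profile in profiles.items():
    print(name, profile)
```

Each customer ends up with a readable three-part profile, and you can see directly which dimension separates one customer from another, which is harder to do when all of the variables are folded into a single solution.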
I hope this helps! Cordially, Doug