The CCC (cubic clustering criterion) is a statistic created by Warren Sarle of SAS nearly 30 years ago. It is documented in SAS Technical Report A-108. On page 48 he writes, "If all values of the CCC are negative and decreasing for two or more clusters, the distribution is probably unimodal or long-tailed." He goes on to say that very negative values may be due to outliers, which he recommends removing (not my recommended best practice).

In my experience, the CCC is a heuristic that needs to be triangulated with the approximate R2 and with the distribution of the cluster frequencies. For the CCC and R2, look at how they vary across a set of solutions (e.g., wrap FASTCLUS in a macro and run solutions from 3 to 30 clusters) and examine the solutions with the highest values of those statistics, even when the CCC is negative. Solutions whose cluster frequencies are highly uneven, for example 1 or 2 large clusters alongside several small ones, are not appropriate and do not lead to good segmentations.

It's also important to note that FASTCLUS is a k-means algorithm, so the clusters it produces tend to be compact and roughly spherical. If the shape of your clusters is irregular, you may want to consider a different algorithm, e.g., a nonparametric approach.
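To make the triangulation idea concrete, here is a minimal Python sketch of how the per-solution statistics could be compared once they have been collected from the FASTCLUS runs. Everything in it is an assumption for illustration: the `Solution` record, the `size_imbalance` measure (largest cluster divided by mean cluster size), the `max_imbalance=2.0` cutoff, and the sample numbers are all made up and are not FASTCLUS output or a SAS-documented rule.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Solution:
    """One clustering run (hypothetical record; fill it from your own output)."""
    k: int            # number of clusters requested
    ccc: float        # cubic clustering criterion for this run
    rsq: float        # approximate overall R-squared for this run
    sizes: List[int]  # cluster frequencies


def size_imbalance(sizes: List[int]) -> float:
    """Largest cluster divided by the mean cluster size (1.0 = perfectly even)."""
    return max(sizes) / mean(sizes)


def shortlist(solutions: List[Solution],
              max_imbalance: float = 2.0,
              top_n: int = 3) -> List[Solution]:
    """Drop lopsided solutions, then rank the rest by CCC, breaking ties on R2.

    The cutoff and the ranking order are illustrative choices, not a standard.
    A negative CCC does not disqualify a solution here.
    """
    balanced = [s for s in solutions if size_imbalance(s.sizes) <= max_imbalance]
    ranked = sorted(balanced, key=lambda s: (s.ccc, s.rsq), reverse=True)
    return ranked[:top_n]


# Made-up numbers, for illustration only.
candidates = [
    Solution(3, -12.4, 0.41, [700, 200, 100]),
    Solution(4, -8.1, 0.52, [300, 260, 240, 200]),
    Solution(5, -5.3, 0.58, [230, 210, 200, 190, 170]),
    Solution(6, -6.9, 0.61, [600, 100, 90, 80, 70, 60]),
    Solution(7, -7.5, 0.63, [160, 150, 150, 140, 140, 130, 130]),
]

print([s.k for s in shortlist(candidates, top_n=2)])  # prints [5, 7]
```

In this toy data every CCC is negative, yet the 5- and 7-cluster solutions still stand out, while the 3- and 6-cluster solutions are dropped because one cluster dominates. The shortlist is only a starting point for looking at the actual cluster profiles.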