For k-means cluster analysis, one can use PROC FASTCLUS, for example:
proc fastclus data=mydata out=out maxc=4 maxiter=20;
Then change the value of maxc=, run the procedure several times, and compare the pseudo F and CCC values to see which number of clusters gives a peak.
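The repeated runs described above could be automated with a small macro loop. This is only a sketch: the macro name try_k and the variable list x1-x5 are hypothetical placeholders, and mydata is the data set from the call above. Each run prints its own pseudo F statistic and cubic clustering criterion, which can then be compared across values of k.

```sas
/* Sketch only: try_k and x1-x5 are hypothetical names, not from the original post */
%macro try_k(kmin=2, kmax=8);
  %do k = &kmin %to &kmax;
    title "PROC FASTCLUS with maxclusters=&k";
    /* SHORT suppresses the long tables of cluster means and standard deviations */
    proc fastclus data=mydata maxclusters=&k maxiter=20 short;
      var x1-x5;
    run;
  %end;
  title;  /* clear the title after the last run */
%mend try_k;

%try_k(kmin=2, kmax=8)
```

Because k-means results depend on the initial seeds, it may be worth repeating the comparison with a different sort order or seed selection before trusting a peak.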
Alternatively, one can use PROC CLUSTER:
PROC CLUSTER data=mydata METHOD=WARD outtree=tree ccc pseudo print=15;
to find the number of clusters from the pseudo F, pseudo t-squared and CCC statistics, and also look for a jump in the semipartial R-square.
Sometimes these indicators do not agree with each other. Which indicator is more reliable? Thanks!
If you are undecided between two values of k, you can use Beale's F-type statistic to determine the final number of clusters. It tells you whether the solution with more clusters is significantly better than the one with fewer clusters; if it is not, the solution with fewer clusters is preferable.
This technique is discussed in the "Applied Clustering Techniques" course notes.
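For reference, Beale's statistic is commonly written as follows. The notation is an assumption made here, not taken from the course notes: n observations, p variables, two solutions with k_1 < k_2 clusters, and W_1, W_2 their within-cluster sums of squared distances to the cluster centers. It is worth checking the exact form against the course notes before relying on it.

```latex
F = \frac{(W_1 - W_2)/W_2}
         {\dfrac{n - k_1}{n - k_2}\left(\dfrac{k_2}{k_1}\right)^{2/p} - 1},
\qquad
\text{compared with } F\bigl(p\,(k_2 - k_1),\; p\,(n - k_2)\bigr)
```

A large F (small p-value) suggests that the k_2-cluster solution is a significant improvement; otherwise keep the k_1-cluster solution.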
You can also try something relatively new.
Tip: K-means clustering in SAS - comparing PROC FASTCLUS and PROC HPCLUS
For numeric variables, PROC HPCLUS provides the convenient NOC=ABC option to auto-select the number of clusters k based on the aligned box criterion (ABC). For each value of k from MINCLUSTERS (default 2) to MAXCLUSTERS, ABC compares the within-cluster dispersion of the results to that of a simulated reference distribution, and selects a value of k where the within-cluster dispersions of the data results and the reference distribution differ greatly.
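A call along the following lines would ask PROC HPCLUS to pick k with the aligned box criterion. This is a sketch under assumptions: the data set mydata and the inputs x1-x5 are placeholders, the range 2 to 10 and the seed are arbitrary, and the option spellings should be checked against the PROC HPCLUS documentation for your release.

```sas
/* Sketch only: mydata and x1-x5 are placeholder names */
proc hpclus data=mydata
            maxclusters=10            /* upper end of the k range to try */
            maxiter=20
            seed=12345                /* arbitrary seed, for reproducibility */
            noc=abc(minclusters=2);   /* lower end of the k range; ABC selects k */
  input x1-x5;                        /* numeric clustering variables */
run;
```

The selected number of clusters is reported in the procedure output, so it can be compared with the peaks suggested by pseudo F and CCC.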
An Overview of Machine Learning with SAS® Enterprise Miner™
Patrick Hall, Jared Dean, Ilknur Kaynar Kabul, Jorge Silva
SAS Institute Inc.
Search the paper for the keywords HPCLUS and ABC.
PROC HPCLUS is one of many High-Performance Procedures in SAS Enterprise Miner 13.2 and beyond.