(Magic)Number of Clusters....How do we arrive at desired number of clusters

Hello Experts,

I need to run a hierarchical cluster analysis, but I cannot decide on the number of clusters. The documentation for PROC ACECLUS says:

"Neither cluster membership nor the number of clusters needs to be known. PROC ACECLUS is useful for preprocessing data to be subsequently clustered by the CLUSTER or FASTCLUS procedure"

But the one example in the documentation (SAS/STAT(R) 9.2 User's Guide, Second Edition) uses the MAXC=3 option. That option belongs to the FASTCLUS procedure, which of course requires it, and it amounts to providing the number of clusters explicitly. If that is how it works, what is the point of running ACECLUS when we supply the number of clusters ourselves, and why does the sentence quoted above say the number of clusters need not be known? I am confused.

My main question, though: can we use the FASTCLUS or CLUSTER procedure without first running ACECLUS? (I think the answer is yes.) ACECLUS still has its own value, since it computes canonical variables when the dataset has variables measured on different scales. And if we do run ACECLUS first, how do we arrive at the desired number of clusters, given that the user is a novice and is not familiar with the different algorithms, methods, business needs, and so on?


Harshad M.


Re: (Magic)Number of Clusters....How do we arrive at desired number of clusters

Posted in reply to HarshadMadhamshettiwar

PROC CLUSTER does not require the number of clusters ahead of time. It builds the full hierarchy, and you decide afterwards where to cut it.
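
As a minimal sketch of that point, PROC CLUSTER can build the whole hierarchy without a cluster count, and the count is only needed later, when PROC TREE cuts the tree. This assumes the SASHELP.IRIS sample data set is available in your installation; the data set names tree and clusters and the value NCLUSTERS=3 are illustrative choices, not recommendations:

```sas
/* Build the full hierarchy; no cluster count is supplied here. */
proc cluster data=sashelp.iris method=ward outtree=tree noprint;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Cut the tree afterwards. NCLUSTERS=3 is an illustrative value only. */
proc tree data=tree nclusters=3 out=clusters noprint;
run;

/* Check how many observations fall into each cluster. */
proc freq data=clusters;
   tables cluster;
run;
```

Changing NCLUSTERS= and rerunning only the PROC TREE step lets you compare several cuts of the same hierarchy without re-clustering.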

There is no hard-and-fast rule for deciding the number of clusters. There are some proposed methods, including the cubic clustering criterion (CCC).
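
A hedged sketch of how to request those statistics: the CCC and PSEUDO options on PROC CLUSTER add the cubic clustering criterion and the pseudo F and pseudo t-squared statistics to the cluster history, and PRINT=15 limits that history to the last 15 merges. SASHELP.IRIS, the Ward method, and the value 15 are assumptions made for illustration:

```sas
/* Request CCC plus pseudo F / pseudo t-squared in the cluster history. */
/* PRINT=15 shows only the last 15 generations of the history.          */
proc cluster data=sashelp.iris method=ward ccc pseudo print=15
             outtree=tree;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

A common reading is to look for cluster counts where the CCC and pseudo F show a local peak and the pseudo t-squared is small just before a jump at the next merge. These are guides rather than tests, so the candidates they suggest still need to be checked against what makes sense for the data.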

Generally, I would also recommend applying business knowledge to the clustering.

Users should familiarize themselves with the different methods/algorithms/business needs before proceeding to do an analysis.
