Proc fastclus: Relative importance of Variables within Clusters

Mgarret — Wed, 18 Oct 2017 15:59:59 GMT

All,

I’m using PROC FASTCLUS in SAS to perform a cluster analysis, and I’m trying to find a way to determine the relative importance of the variables within each cluster. In other words, which variables are the primary drivers of a cluster (or have the most predictive power, so to speak), and how do they rank?

Here is an example.

Using the variables A, B, C, D, E, and F, I build a cluster model with 3 segments.

   proc fastclus data=DATA_SET maxc=3 out=CUSTER_Results;
      var A B C D E F;
   run;

I want to rank order the variables by relative importance within each cluster.

Cluster 1

Rank order of variable Importance:

B – Primary Driver of segment (most predictive)
D
A
C
E
F – Least predictive

Cluster 2

Rank order of variable Importance:

A – Primary Driver of segment (most predictive)
C
F
E
B
D – Least predictive

Cluster 3

Rank order of variable Importance:

D – Primary Driver of segment (most predictive)
A
C
B
F
E – Least predictive

Is there an option for PROC FASTCLUS that will do this automatically? If not, any recommendations on how to determine the predictive rank order?

Thanks.

Re: Proc fastclus: Relative importance of Variables within Clusters

ballardw — Wed, 18 Oct 2017 21:59:41 GMT

First thing to consider from the documentation:

PROC FASTCLUS uses algorithms that place a larger influence on variables with larger variance, so it might be necessary to standardize the variables before performing the cluster analysis.

So, did you examine your variables before clustering to see how much their variances differ?
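
One way to check, as a rough sketch rather than a definitive recipe: compare the raw variances with PROC MEANS, standardize with PROC STDIZE if they differ a lot, and then rerun the clustering on the standardized copy. The data set names DATA_STD and CLUSTER_Results_STD are placeholders made up for this illustration; CLUSTER is the default name of the cluster-membership variable that FASTCLUS writes to the OUT= data set.

```sas
/* Compare the spread of the clustering variables on their raw scale */
proc means data=DATA_SET n mean std var min max;
   var A B C D E F;
run;

/* If the variances differ a lot, rescale each variable to mean 0, std 1 */
/* DATA_STD is a hypothetical output name */
proc stdize data=DATA_SET out=DATA_STD method=std;
   var A B C D E F;
run;

/* Rerun the clustering on the standardized copy */
/* CLUSTER_Results_STD is a hypothetical output name */
proc fastclus data=DATA_STD maxclusters=3 out=CLUSTER_Results_STD;
   var A B C D E F;
run;

/* Per-cluster means of the standardized variables */
proc means data=CLUSTER_Results_STD mean std maxdec=2;
   class CLUSTER;
   var A B C D E F;
run;
```

Once the variables are on a common scale, the cluster means are in standard-deviation units, so a variable whose mean in a given cluster is far from 0 is one on which that cluster differs most from the data as a whole. That is only a heuristic for ranking variables within a cluster, not a built-in importance measure. The default FASTCLUS output also includes a Statistics for Variables table (with R-Square and RSQ/(1-RSQ)), which indicates how well each variable separates the clusters overall rather than within any one cluster.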
