How to profile and interpret clusters?

kinoo1989 · Posted 11-28-2018 03:45 PM

Hello, I have 77 variables and 27,000 observations. My goal is to find meaningful clusters in this data, but I am finding it challenging to interpret the clusters.

What I have tried so far: I performed PCA (using PROC PRINCOMP), which gave me an idea of the reduced dimension. Then I used the relevant PCs as inputs to PROC FASTCLUS. After a few iterations, I got an output with the desired number of significant clusters.

Then I combined the original input variables with the cluster assignments. I did this because I thought it would let me make sense of the clusters in terms of the original variables, even though the PCs were used to derive the clusters.

My problem is how to profile the clusters to understand their business significance (interpretation). I tried PROC TABULATE, but the result was hard to make sense of because I have 77 original variables to compare across the clusters.

What should the next step be? Should I check for multicollinearity and remove as many variables as I can, or is there an easier way? I would appreciate any feedback or tips to resolve this issue.

Thank you in advance.

Regards

Kino

Ksharp · Posted 11-29-2018 08:24 AM

If you want to cluster the variables, check PROC VARCLUS.

If you want to pick out the most significant variables, check PROC PLS or PROC HPGENSELECT.
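
As a complement to variable selection, one common way to profile clusters on many original variables is to compare each cluster's mean with the overall mean, in units of the overall standard deviation, and look only at the few variables that deviate most. The sketch below is a minimal, language-neutral illustration in plain Python, not something taken from this thread: the function name `profile_clusters`, the variable names, and the toy data are all made up for the example.

```python
from statistics import mean, pstdev


def profile_clusters(rows, labels, names, top=3):
    """Rank variables per cluster by standardized distance from the overall mean.

    rows   : list of observations, each a sequence of numeric values
    labels : cluster label for each observation (same order as rows)
    names  : variable names (same order as the values in each row)
    top    : how many of the most distinctive variables to keep per cluster
    """
    columns = list(zip(*rows))
    overall_mean = [mean(col) for col in columns]
    overall_sd = [pstdev(col) for col in columns]

    profiles = {}
    for cluster in sorted(set(labels)):
        members = [row for row, lab in zip(rows, labels) if lab == cluster]
        scores = []
        for j, name in enumerate(names):
            if overall_sd[j] == 0:
                continue  # a constant variable cannot distinguish clusters
            cluster_mean = mean(row[j] for row in members)
            z = (cluster_mean - overall_mean[j]) / overall_sd[j]
            scores.append((name, round(z, 2)))
        # Largest absolute deviation first: these variables describe the cluster.
        scores.sort(key=lambda item: abs(item[1]), reverse=True)
        profiles[cluster] = scores[:top]
    return profiles


# Hypothetical toy data: three variables, two clusters.
names = ["income", "age", "visits"]
rows = [
    (20, 30, 1),
    (22, 31, 9),
    (21, 29, 5),
    (80, 30, 2),
    (82, 31, 8),
    (81, 29, 5),
]
labels = [1, 1, 1, 2, 2, 2]

profiles = profile_clusters(rows, labels, names, top=2)
for cluster, ranked in profiles.items():
    print(cluster, ranked)
```

In the toy data only `income` differs between the clusters, so it comes out first for both, with opposite signs. With 77 variables, keeping just the top handful per cluster gives a much shorter table to read than a full cross-tabulation. In SAS, the per-cluster means themselves can be obtained with PROC MEANS and a CLASS statement on the cluster variable.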
