03-31-2017 05:32 AM
How can I identify the important variables before running a cluster analysis? I have a large customer database (purchases, loyalty, ...) with 150 features.
I want to work in two steps: 1- Eliminate features that are not useful for clustering, i.e. an equivalent of the correlation analysis between the target and each feature that one would do in the supervised case. 2- Eliminate correlated features, using variable clustering or PCA. Then I run the clustering.
I'm comfortable with the second step, but I have no idea how to approach the first one. What is the equivalent in the unsupervised case? Which exploratory analyses could help me?
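
For context, the correlation-filtering part of step 2 could look something like the minimal sketch below. It is only an illustration: the 0.9 threshold, the function names `pearson` and `drop_correlated`, and the toy columns are assumptions, not the real data or the exact method in use.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation
    with every feature already kept is at or below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data standing in for the customer table.
customers = {
    "purchase_amount": [10.0, 20.0, 30.0, 40.0, 50.0],
    "purchase_amount_eur": [9.0, 18.0, 27.0, 36.0, 45.0],  # rescaled copy
    "loyalty_years": [3.0, 1.0, 4.0, 1.0, 5.0],
}

selected = drop_correlated(customers)
print(selected)
```

The filter is order-dependent: of two highly correlated features, the one that appears first is the one that survives. Step 1 has no counterpart in this sketch, which is exactly the part the question is about.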