Clustering by one variable

Matt3 — Mon, 30 Oct 2017 07:57:48 GMT

Hi,

I have no experience in clustering, so I would be grateful if anyone could help me choose the optimal method.

I am going to group about 1.5 million customers by one variable (I have more, but all of them are highly correlated). About 50% of the observations have the value 0 in the clustering variable.

I am using the FASTCLUS procedure:

proc fastclus data=wyn2 least=1 maxc=4;
var zasilenia_za_ost_3m;
run;

What I received is 4 clusters with:

1499995 observations in the first cluster
2 in the second cluster
1 in the third cluster
2 in the fourth cluster

It works the same way even if I remove the observations with 0 in the clustering variable.

So my question is: which method/procedure would be best in this case?

Thank you.

Re: Clustering by one variable

Reeza — Mon, 30 Oct 2017 15:08:09 GMT

The method doesn't really matter, since you have a single variable. Looking at distribution plots and choosing your cut-off points with some common sense is probably a better approach.
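
As a rough sketch of that approach, something like the following could work. It reuses the data set and variable from the question. The output data set wyn2_grouped, the variable grupa, and the cut-offs 100 and 1000 are made-up placeholders, and the code assumes the variable has no negative values. Pick real cut-offs after looking at the histogram.

```sas
/* Step 1: look at the distribution of the non-zero values. */
proc univariate data=wyn2;
  where zasilenia_za_ost_3m > 0;
  var zasilenia_za_ost_3m;
  histogram zasilenia_za_ost_3m;
run;

/* Step 2: assign groups from cut-offs chosen by eye.
   100 and 1000 are placeholders, not recommendations. */
data wyn2_grouped;          /* hypothetical output data set */
  set wyn2;
  length grupa $8;          /* hypothetical group variable */
  if missing(zasilenia_za_ost_3m) then grupa = 'missing';
  else if zasilenia_za_ost_3m = 0 then grupa = 'zero';
  else if zasilenia_za_ost_3m < 100 then grupa = 'low';
  else if zasilenia_za_ost_3m < 1000 then grupa = 'medium';
  else grupa = 'high';
run;

/* Step 3: check how many customers fall into each group. */
proc freq data=wyn2_grouped;
  tables grupa;
run;
```

Keeping the zeros as their own group seems reasonable here, since they make up about half of the observations.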
