Matt3
Quartz | Level 8

Hi,

I have no experience in clustering, so I would be grateful if anyone could help me choose the optimal method.

I am going to group about 1.5 million customers by one variable (I have more, but all of them are highly correlated). About 50% of the observations have the value 0 in the clustering variable.

 

I am using the FASTCLUS procedure:

proc fastclus data=wyn2 least=1 maxc=4;
var zasilenia_za_ost_3m;
run;

 

What I received is 4 groups:

1499995 in the first cluster
2 in the second cluster
1 in the third cluster
2 in the fourth cluster

The result is similar even if I remove the observations with 0 in the clustering variable.

 

So my question is: which method/procedure would be the best in this case?

Thank you.

 
1 REPLY 1
Reeza
Super User

It doesn't matter which method you use, since you have a single variable. Looking at distribution plots and choosing your cut-off points with some common sense is probably a better approach.
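As a rough sketch of that approach, something like the following could work. It reuses the data set and variable from the question (wyn2, zasilenia_za_ost_3m). Everything else is an assumption made for illustration: the output names pctl, wyn2_grp and grupa, and in particular the cut-offs 1000 and 10000, which are placeholders and not recommendations. Real cut-offs would come from the histogram and percentiles in step 1.

```sas
/* Step 1: look at the distribution of the non-zero values.        */
/* The zeros are about half of the data, so they are left out here */
/* and treated as their own group in step 2.                       */
proc univariate data=wyn2;
  where zasilenia_za_ost_3m > 0;
  var zasilenia_za_ost_3m;
  histogram zasilenia_za_ost_3m;
  /* Save a few percentiles (pctl and the p_ prefix are made-up names) */
  output out=pctl pctlpts=25 50 75 90 99 pctlpre=p_;
run;

proc print data=pctl;
run;

/* Step 2: assign groups with hand-picked cut-offs.                */
/* 1000 and 10000 are HYPOTHETICAL placeholders; replace them with */
/* values that make sense after inspecting the output of step 1.   */
data wyn2_grp;
  set wyn2;
  length grupa $12;
  if missing(zasilenia_za_ost_3m) then grupa = 'missing';
  else if zasilenia_za_ost_3m = 0 then grupa = '0: zero';
  else if zasilenia_za_ost_3m < 1000 then grupa = '1: low';
  else if zasilenia_za_ost_3m < 10000 then grupa = '2: medium';
  else grupa = '3: high';
run;

/* Check that the group sizes look reasonable. */
proc freq data=wyn2_grp;
  tables grupa;
run;
```

If the variable is very skewed, the handful of extreme customers that FASTCLUS isolated simply land in the top group here instead of taking up three of the four clusters.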

 

 
