Matt3
Quartz | Level 8

Hi,

I have no experience with clustering, so I would be grateful if anyone could help me choose an appropriate method.

I want to group about 1.5 million customers by one variable (I have more, but they are all highly correlated). About 50% of the observations have the value 0 in the clustering variable.

I am using the FASTCLUS procedure:

proc fastclus data=wyn2 least=1 maxc=4;
var zasilenia_za_ost_3m;
run;

 

What I got is 4 clusters:

1499995 observations in the first cluster

2 in the second cluster

1 in the third cluster

2 in the fourth cluster

The result is much the same even if I remove the observations with 0 in the clustering variable.

 

Thus, my question is which method/procedure would be the best in this case?

Thank you.

1 REPLY 1
Reeza
Super User

The method doesn't matter much, because you have a single variable. Looking at distribution plots and setting your cut-off points using some common sense is probably a better approach.
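
As a rough sketch of that suggestion: FASTCLUS tends to choose initial seeds that are far apart, so a handful of extreme values can end up as their own tiny clusters, which may be what happened here. The code below first plots the non-zero values and writes a few percentiles to a data set, then assigns groups from manually chosen cut-offs. The data set and variable names come from the question; the names pctl, wyn2_grp, grupa, cut1 and cut2, and the cut-off values themselves, are placeholders made up for illustration. It also assumes the variable is never negative.

```sas
/* Step 1: inspect the distribution of the non-zero values.          */
/* The percentiles are written to WORK.PCTL as p50, p90 and p99.     */
proc univariate data=wyn2;
  where zasilenia_za_ost_3m > 0;
  var zasilenia_za_ost_3m;
  histogram zasilenia_za_ost_3m;
  output out=pctl pctlpts=50 90 99 pctlpre=p;
run;

/* Step 2: assign groups from cut-offs chosen after looking at the   */
/* plot. The two values below are placeholders, not recommendations. */
%let cut1 = 1000;
%let cut2 = 10000;

data wyn2_grp;
  set wyn2;
  if missing(zasilenia_za_ost_3m) then grupa = .;       /* keep missing apart */
  else if zasilenia_za_ost_3m = 0 then grupa = 0;       /* the ~50% zeros     */
  else if zasilenia_za_ost_3m <= &cut1 then grupa = 1;
  else if zasilenia_za_ost_3m <= &cut2 then grupa = 2;
  else grupa = 3;
run;

/* Step 3: check how many customers fall into each group.            */
proc freq data=wyn2_grp;
  tables grupa / missing;
run;
```

If the histogram is dominated by a long right tail, plotting a log-transformed copy of the variable may make sensible cut-off points easier to see.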

 

 
