Matt3
Quartz | Level 8

Hi,

I have no experience in clustering, so I would be grateful if anyone could help me choose the optimal method.

I am going to group about 1.5 million customers by one variable (I have more, but they are all highly correlated). About 50% of the observations have a value of 0 in the clustering variable.

 

I am using the FASTCLUS procedure:

proc fastclus data=wyn2 least=1 maxc=4;
var zasilenia_za_ost_3m;
run;

 

What I received is 4 clusters with:

1499995 observations in the first cluster

2 in the second cluster

1 in the third cluster

2 in the fourth cluster.

The result looks the same even if I remove the observations with 0 in the clustering variable.

 

Thus, my question is which method/procedure would be the best in this case?

Thank you.

1 REPLY
Reeza
Super User

With a single variable, the choice of clustering method doesn't really matter. Looking at distribution plots and choosing your cut-off points with some common sense is probably a better approach.
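
As a rough sketch of that approach (not a definitive recipe), something like the following could work. It reuses the dataset wyn2 and the variable zasilenia_za_ost_3m from the question; the output dataset names (pctl, nonzero, segments), the choice of three groups for the non-zero customers, and the assumption that the variable is never negative are illustrative only. The near-empty clusters from FASTCLUS suggest that a handful of extreme values may dominate the distances, which rank-based cut-offs are not sensitive to.

```sas
/* Step 1: look at the distribution of the non-zero values.          */
/* The percentile points are arbitrary examples; adjust as needed.   */
proc univariate data=wyn2;
  where zasilenia_za_ost_3m > 0;
  var zasilenia_za_ost_3m;
  histogram zasilenia_za_ost_3m;
  output out=pctl pctlpts=25 50 75 90 99 pctlpre=p_;
run;

proc print data=pctl;
run;

/* Step 2: split the non-zero customers into equal-sized groups.     */
/* GROUPS=3 is an assumption; PROC RANK numbers the groups 0, 1, 2.  */
proc rank data=wyn2(where=(zasilenia_za_ost_3m > 0))
          out=nonzero groups=3;
  var zasilenia_za_ost_3m;
  ranks tercile;
run;

/* Step 3: keep the zero-valued customers as their own segment (0)   */
/* and shift the rank groups to segments 1-3. Missing and negative   */
/* values are dropped by the WHERE conditions above.                 */
data segments;
  set wyn2(where=(zasilenia_za_ost_3m = 0) in=is_zero)
      nonzero;
  if is_zero then segment = 0;
  else segment = tercile + 1;
run;

proc freq data=segments;
  tables segment;
run;
```

If the histogram shows natural breaks, hard-coded cut-off values in a DATA step would be an equally reasonable alternative to the equal-sized groups from PROC RANK.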

 

 
