Matt3
Quartz | Level 8

Hi,

I have no experience with clustering, so I would be grateful if anyone could help me choose the best method.

I want to group about 1.5 million customers by one variable (I have more, but all of them are highly correlated). About 50% of the observations have a value of 0 in the clustering variable.

 

I am using the FASTCLUS procedure:

proc fastclus data=wyn2 least=1 maxc=4;
var zasilenia_za_ost_3m;
run;

 

What I got is 4 clusters:

1499995 in the first cluster

2 in the second cluster

1 in the third cluster

2 in the fourth cluster

I get the same kind of result even if I remove the observations with 0 in the clustering variable.

 

So my question is: which method or procedure would be best in this case?

Thank you.

1 REPLY 1
Reeza
Super User

With a single variable, the choice of clustering method doesn't really matter. Looking at distribution plots and choosing your cut-off points with some common sense is probably a better approach.
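
One possible way to do that, as a rough sketch only: the data set wyn2 and the variable zasilenia_za_ost_3m come from the question, but the output data sets pctl and wyn2_seg, the segment labels, and the cut-off values 1000 and 10000 are placeholders I made up for illustration. You would pick real cut-offs after looking at the histogram and the percentiles.

```sas
/* Sketch under stated assumptions: pctl, wyn2_seg, the segment labels */
/* and the cut-offs 1000 / 10000 are hypothetical placeholders.        */

/* 1. Look at the distribution, including the upper tail. */
proc univariate data=wyn2;
   var zasilenia_za_ost_3m;
   histogram zasilenia_za_ost_3m;
   output out=pctl pctlpts=50 75 90 95 99 pctlpre=p_;
run;

proc print data=pctl;
run;

/* 2. Assign segments from cut-offs chosen by eye. Zeros get their own */
/*    segment because they make up about half of the observations.     */
data wyn2_seg;
   set wyn2;
   length segment $8;
   if missing(zasilenia_za_ost_3m) then segment = 'missing';
   else if zasilenia_za_ost_3m = 0 then segment = 'zero';
   else if zasilenia_za_ost_3m <= 1000 then segment = 'low';     /* placeholder cut-off */
   else if zasilenia_za_ost_3m <= 10000 then segment = 'medium'; /* placeholder cut-off */
   else segment = 'high';
run;

/* 3. Check how many customers fall into each segment. */
proc freq data=wyn2_seg;
   tables segment;
run;
```

If the variable can be negative, those values would land in 'low' here, so adjust the conditions to fit your data. The handful of extreme values that formed the tiny clusters in the FASTCLUS output would simply end up in the top segment.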

 

 
