DavidBesaev
Fluorite | Level 6

Hi! 

This is my first encounter with the CCC. I'm a beginner trying to figure out an outflow model, and I ran into this clustering criterion. Can you explain in simple terms how best to interpret it?
I'm not very good at reading specialized literature in English. I found SAS Technical Report A-108 but can't understand its main point.

image.png

Thank you!

Accepted Solutions
MikeStockstill
SAS Employee

Hello DavidBesaev -

 

Here is a link to the technical report that you mentioned.

 

A-108 Cubic Clustering Criterion

http://support.sas.com/kb/22/addl/fusion_22540_1_a108_5903.pdf

 

The best place to look for guidance on interpreting the CCC is the Conclusion section, printed page 49. Here is a brief summary:

 

  • Peaks in the plot of the cubic clustering criterion with values greater than 2 or 3 indicate good clusters.
  • Peaks with values between 0 and 2 indicate possible clusters.
  • Large negative values of the CCC can indicate outliers.
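
As a rough illustration of the first two rules, here is a minimal Python sketch that finds local peaks in a CCC-versus-number-of-clusters curve and labels them. The function name `classify_ccc_peaks`, the `good_threshold` parameter, and the sample CCC values are all made up for illustration; they are not taken from the report or from any SAS output.

```python
def classify_ccc_peaks(ccc_by_k, good_threshold=2.0):
    """Label each interior local peak of a CCC curve.

    ccc_by_k maps a number of clusters k to its CCC value.
    good_threshold is an assumption: the summary says "2 or 3".
    """
    ks = sorted(ccc_by_k)
    labels = {}
    for i in range(1, len(ks) - 1):
        prev_v = ccc_by_k[ks[i - 1]]
        v = ccc_by_k[ks[i]]
        next_v = ccc_by_k[ks[i + 1]]
        # A peak is a value strictly higher than both neighbours.
        if v > prev_v and v > next_v:
            if v > good_threshold:
                labels[ks[i]] = "good"
            elif v >= 0:
                labels[ks[i]] = "possible"
            else:
                labels[ks[i]] = "below 0 (not covered by the summary)"
    return labels


# Hypothetical CCC values, for illustration only.
example = {1: -1.0, 2: 0.5, 3: 3.2, 4: 1.0, 5: 1.5, 6: 0.8}
print(classify_ccc_peaks(example))
```

With these made-up values, the peak at 3 clusters is labeled "good" and the smaller peak at 5 clusters is labeled "possible". The labels are only a starting point for choosing which cluster solutions to examine more closely.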

 

Pages 40-48 give some examples of interpretations.

 

Another good place to look for interpretation examples is the Getting Started and Examples sections of the chapter "The CLUSTER Procedure".

 

        SAS/STAT User's Guide - Procedures

        https://support.sas.com/documentation/onlinedoc/stat/indexproc.html

 

     

Have a great day.


DavidBesaev
Fluorite | Level 6

Thank you so much for your help, MikeStockstill!

I will study these sources more closely.

So, looking at my CCC plot, would performance be good with more than 4 clusters (more than 2000 points)?

 

image.png

Thank you! 

DougWielenga
SAS Employee

 


 

It is best to review the report, since the CCC is just one way to evaluate a candidate number of clusters, and there are situations where the CCC might not be the best statistic to use. The goal of clustering is typically to provide interpretable and/or usable results for your analysis needs. Think of the CCC plot as recommending a range of cluster solutions that might be useful; you can then compare the competing solutions to see which one best meets those needs.

 

When I see the CCC increasing slowly over the larger numbers of clusters, I would expect the additional splits to be pulling off clusters with small numbers of observations. That can be useful if you are trying to isolate unusual, potentially fraudulent cases, but it is not helpful if you are doing marketing, where small clusters are not large enough to warrant special treatment. Check multiple cluster solutions and choose what is best for your scenario.
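
One way to check for small clusters when comparing solutions is to count the observations in each cluster and flag the ones that fall under a size you care about. The Python sketch below assumes you have exported the cluster assignment for each observation. The function name `small_clusters`, the `min_size` cutoff, and the sample assignments are hypothetical, chosen only to show the idea.

```python
from collections import Counter


def small_clusters(assignments, min_size):
    """Return {cluster_id: size} for clusters with fewer than min_size observations."""
    sizes = Counter(assignments)
    return {cluster: n for cluster, n in sizes.items() if n < min_size}


# Hypothetical cluster assignments for two competing solutions.
solutions = {
    3: [0] * 50 + [1] * 45 + [2] * 5,
    4: [0] * 50 + [1] * 42 + [2] * 5 + [3] * 3,
}

for k, assignments in solutions.items():
    # min_size=10 is an arbitrary cutoff; pick one that fits your use case.
    print(k, small_clusters(assignments, min_size=10))
```

Whether a flagged cluster is a problem depends on the goal: a tiny cluster may be exactly what you want when looking for unusual cases, and too small to act on for marketing.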

 

Hope this helps!

Doug
