SAS Enterprise Guide

Desktop productivity for business analysts and programmers
ryesidariza
Calcite | Level 5

How do I calculate the silhouette coefficient, and what's the code for that? Or an example I can follow?

9 REPLIES
Reeza
Super User

Which tool are you using, and what exact version? Some have this built in; with others you'll have to do your own calculations.

ryesidariza
Calcite | Level 5

Thanks.

K-means clustering in SAS Enterprise Guide 7.1.

Reeza
Super User
Can you run the following and post the output from the log? SAS versioning is different than EG versioning.

proc product_status;run;
Diego_L
Calcite | Level 5

proc product_status; run;

For Base Product ...
Custom version information: 9.21_M3
Image version information: 9.02.02M3P032410
For SAS/STAT ...
Custom version information: 9.22
Image version information: 9.02.02M0P033110
For SAS/GRAPH ...
Custom version information: 9.21_M2
For SAS Integration Technologies ...
Custom version information: 9.2_M2
For SAS/ACCESS Interface to PC Files ...
Custom version information: 9.21_M2

Reeza
Super User
You're on a really old version (9.2 was released in 2008) so I doubt it has that measurement. Unfortunately that means you'll need to calculate it manually. Otherwise, try PROC SCORE or PLM for starting points.

Diego_L
Calcite | Level 5

We are currently in the process of upgrading. Starting with which version is the silhouette coefficient available?
Do you have an example that I can follow in a more current version?
Thank you.

RalphAbbey
SAS Employee

To my knowledge the Silhouette Coefficient isn't calculated as part of any SAS procedure or SAS Viya action. When I wrote the SAS Global Forum paper my goal was to discuss a lot of the methods that people use to evaluate clustering, but not necessarily limit myself to methods that are supported within the software.

 

There are some definite downsides to the Silhouette Coefficient: 1) you shouldn't use it to compare two different types of clustering, since it's very much related to centroid-based clustering methods; 2) it is a time-consuming metric to calculate if you have large numbers of observations.
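
For reference, the usual textbook definition of the silhouette is sketched below. The notation is an assumption introduced here, not something from the paper mentioned above: d(i, j) is the distance between observations i and j, C_I is the cluster containing observation i, and n is the number of observations. Because a(i) and b(i) average distances from i to other observations, the full calculation touches every pair of observations, which is why the cost grows quickly with n.

```latex
a(i) = \frac{1}{|C_I| - 1} \sum_{j \in C_I,\ j \neq i} d(i, j)
\qquad
b(i) = \min_{J \neq I} \frac{1}{|C_J|} \sum_{j \in C_J} d(i, j)

s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}
\quad (\text{with } s(i) = 0 \text{ when } |C_I| = 1),
\qquad
\bar{s} = \frac{1}{n} \sum_{i=1}^{n} s(i)
```

Each s(i) lies between -1 and 1. Values near 1 mean the observation sits much closer to its own cluster than to the nearest other cluster.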

 

That being said, I think that one of the better ways to implement it, if you are looking to do so in SAS, would be SAS/IML. Because the Silhouette Coefficient looks across all pairs of observations, the matrix setup of IML would be an easier programming approach for this type of problem (as compared to the DATA step, which is still doable).
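
A minimal SAS/IML sketch of that approach follows. It is an illustration under stated assumptions, not a tested production routine: it assumes SAS/IML is licensed (the PROC PRODUCT_STATUS output earlier in this thread does not list it), Euclidean distance, and a cluster label already assigned to each observation. The toy matrix `x` and the vector `cluster` are hypothetical stand-ins for real data.

```sas
proc iml;
   /* Hypothetical toy data: six 2-D points with assumed cluster labels */
   x       = {1 1, 1 2, 2 1, 8 8, 8 9, 9 8};
   cluster = {1, 1, 1, 2, 2, 2};

   n   = nrow(x);
   ids = unique(cluster);      /* row vector of distinct cluster labels */
   k   = ncol(ids);
   s   = j(n, 1, 0);           /* silhouette value for each observation */

   do i = 1 to n;
      /* Euclidean distance from observation i to every observation */
      diff = x - repeat(x[i, ], n, 1);
      d    = sqrt(diff[ , ##]);

      a = .;   /* mean distance to the other members of i's own cluster */
      b = .;   /* smallest mean distance to the members of another cluster */
      do c = 1 to k;
         idx = loc(cluster = ids[c]);
         m   = ncol(idx);
         if cluster[i] = ids[c] then do;
            /* d[i] is 0, so dividing the sum by m - 1 excludes i itself */
            if m > 1 then a = sum(d[idx]) / (m - 1);
         end;
         else do;
            avg = sum(d[idx]) / m;
            if b = . | avg < b then b = avg;
         end;
      end;

      /* Singleton clusters (or a single cluster overall) keep s = 0 */
      if a ^= . & b ^= . then s[i] = (b - a) / max(a, b);
   end;

   meanSilhouette = s[:];      /* average silhouette over all observations */
   print s meanSilhouette;
quit;
```

With clusters as well separated as in this toy data, every silhouette value should come out close to 1. For real data, `x` and `cluster` could be loaded with USE and READ ALL VAR statements from the OUT= data set of a clustering procedure such as PROC FASTCLUS, which writes a CLUSTER variable. The loop computes one row of distances at a time rather than storing the full n-by-n matrix, but the run time still grows with the square of the number of observations.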

