How do I calculate the silhouette coefficient, and what's the code for that? Or an example I can follow?
Seems to be covered here:
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3409-2019.pdf
Which tool are you using, and what exact version? Some have this built in; with others you'll have to do your own calculations.
Thanks
K-means clustering, in SAS Enterprise Guide 7.1
proc product_status; run;
For Base Product ...
Custom version information: 9.21_M3
Image version information: 9.02.02M3P032410
For SAS/STAT ...
Custom version information: 9.22
Image version information: 9.02.02M0P033110
For SAS/GRAPH ...
Custom version information: 9.21_M2
For SAS Integration Technologies ...
Custom version information: 9.2_M2
For SAS/ACCESS Interface to PC Files ...
Custom version information: 9.21_M2
We are currently in the process of upgrading. From which version onward is the silhouette coefficient available?
Do you have an example that I can follow in a more current version?
thank you
To my knowledge the Silhouette Coefficient isn't calculated as part of any SAS procedure or SAS Viya action. When I wrote the SAS Global Forum paper my goal was to discuss a lot of the methods that people use to evaluate clustering, but not necessarily limit myself to methods that are supported within the software.
The Silhouette Coefficient has some definite downsides: 1) you shouldn't use it to compare two different types of clustering, because it is closely tied to centroid-based clustering methods, and 2) it is a time-consuming metric to calculate when you have a large number of observations, because it looks across all pairs of observations.
That being said, if you are looking to implement it in SAS, I think one of the better ways would be to use SAS IML. Because the Silhouette Coefficient looks across all pairs of observations, the matrix setup of IML makes this type of problem easier to program than the DATA step (where it is still doable).
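As a rough illustration of the matrix approach, here is a minimal SAS IML sketch, not a tested or official implementation. It uses the usual definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from observation i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster. The toy data, the variable names (X, cluster, D, s, meanS) and the choice of Euclidean distance are all assumptions made for the example, and it assumes a SAS IML release that provides the DISTANCE function.

```sas
proc iml;
/* Hypothetical toy data: six 2-D points with assumed cluster labels */
X = {1 1, 1 2, 2 1, 8 8, 8 9, 9 8};
cluster = {1, 1, 1, 2, 2, 2};

D   = distance(X);        /* n x n Euclidean distance matrix */
n   = nrow(X);
ids = unique(cluster);    /* distinct labels; assumes at least two clusters */
s   = j(n, 1, 0);         /* singleton clusters keep s = 0 by convention */

do i = 1 to n;
   m = 0;  a = 0;
   b = max(D) + 1;        /* larger than any possible mean distance */
   do c = 1 to ncol(ids);
      idx = loc(cluster = ids[c]);          /* members of cluster c */
      if ids[c] = cluster[i] then do;
         m = ncol(idx);
         /* own cluster: D[i,i] = 0, so divide by m - 1 */
         if m > 1 then a = sum(D[i, idx]) / (m - 1);
      end;
      else b = min(b, sum(D[i, idx]) / ncol(idx));   /* nearest other cluster */
   end;
   if m > 1 then s[i] = (b - a) / max(a, b);
end;

meanS = s[:];             /* overall mean silhouette */
print s, meanS;
quit;
```

In practice X and cluster would be read from your clustering output data set rather than typed in. Keep in mind that D holds n x n values, so this direct approach only suits data sets small enough for that matrix to fit in memory.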
Rick Wicklin has recently written a wonderful 2-part blog post about this.
https://blogs.sas.com/content/iml/2023/05/15/silhouette-statistic-cluster.html
https://blogs.sas.com/content/iml/2023/05/17/compute-silhouette-sas.html
Hopefully this can help.