Edoedoedo
Pyrite | Level 9

Hi,

I need to run DBSCAN clustering on a dataset loaded in CAS. As far as I can tell, DBSCAN is implemented neither in CAS nor in SAS 9.4. The only clustering method I found in CAS is k-means, which is not suitable for my application because I am working on anomaly detection.

Did I miss something? If DBSCAN is indeed not implemented anywhere in SAS/CAS, what would you recommend? Should I develop it in CASL, in Base SAS, or in Python with SWAT?

Thank you for any advice.

PS: The DBSCAN implementation needs to be high-performance: my dataset has about a dozen features and a few million rows. I tried the sklearn DBSCAN on my machine and it takes forever, so I guess I need to use the CAS distributed environment.
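For reference, here is a minimal pure-Python sketch of the DBSCAN logic in question, in case it helps anyone weighing a port to CASL or SWAT. It uses a brute-force neighbor search, so it is only illustrative and nowhere near the performance needed for millions of rows. The names `dbscan`, `eps`, and `min_pts` are illustrative choices, not a SAS or CAS API, and the sample points are made up.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster (candidate anomalies)


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Brute-force O(n) scan; the point itself counts as a neighbor.
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nearby = neighbors(j)
            if len(nearby) >= min_pts:
                queue.extend(nearby)  # j is also a core point: keep growing
    return labels


# Made-up sample: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, -1]
```

The points labeled -1 are the ones DBSCAN treats as noise, which is why it is attractive for anomaly detection. The neighbor search is the expensive part, so that is the piece that would have to be distributed or indexed.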

2 REPLIES
gsvolba
SAS Employee

Hi,
in Viya (CAS server) you will also find several additional procedures that are relevant for anomaly detection:

    • Robust Principal Component Analysis (RPCA)
    • Support Vector Data Description (SVDD)
    • Isolation Forest (PROC FOREST)
    • Autoencoder (PROC NNET)
    • PROC SEMISUPLEARN
    • The KCLUS procedure, which also supports nominal variables
      • The KCLUS procedure uses the k-means algorithm for clustering interval input variables, the k-modes algorithm for clustering nominal input variables, and the k-prototypes algorithm for clustering mixed input that contains both interval and nominal variables.

Best regards,

Gerhard


