Hi! I'm working with the Model Studio Clustering node to segment a small database of 70 rows and 35 columns. Except for the ID, all columns are interval variables that were previously standardized. My pipeline is extremely simple and looks like this:
The results from the Clustering node show 5 clusters as the optimal number:
However, when exporting the data (either from the Output Data tab of the node, or using a Save Data or Score Data node), all rows display a null value in _CLUSTER_ID_:
What could be causing the issue?
I have moved this post to the 'Data Mining and Machine Learning' board (where it belongs).
Koen
Hello,
I would have to investigate.
What you see is odd and, of course, not normal behaviour.
But before trying to reproduce it in Model Studio, one remark:
Clustering in Model Studio (VDMML) is built for big data. I'm not sure it will behave well with only 70 records and 35 variables.
If I were doing this, I would use a procedure (or a task in SAS Studio).
Procedures that you can use are:
Good luck,
Koen