Hello
I am still searching for a good data set that showcases variable clustering in VDMML, ideally one that yields three or four clusters of variables, to demonstrate reducing multicollinearity among numeric variables.
Any assistance would be appreciated!
Hi Ghabek. You can get three clusters with the data set pva_raw_data, which is used in the EM Applied Analytics course. I had to adjust the Variable Clustering node properties to get the three clusters. My changes from the defaults: deselect the "Use default maximum number of variables per cluster" checkbox, set "Number of variables per cluster upper threshold" to 8, and set "Clustering Rho value" to 0.7.
Hello
I ran it with your settings and still get only one cluster of variables. I'm using Average Gift as the interval target in the pipeline, with the Variable Clustering node configured as you described.
I got the data set from the web; it has 7,600 records.
Any ideas why I can't get three clusters?
Thanks!
Your data set is different from the one provided in the SAS course. If possible, please contact SAS to request the course data.
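If the course data is hard to obtain, one possible fallback is a synthetic data set with a known block structure. The Python sketch below is only an illustration under stated assumptions: it builds three groups of numeric variables, each group driven by its own latent factor, and then checks the grouping with a simple correlation threshold. That check is a stand-in and is not the algorithm the Variable Clustering node uses. The function names, variable names (b1_x1 and so on), noise level, and 0.5 threshold are all hypothetical choices.

```python
import csv
import math
import random


def make_blocked_data(n_rows=2000, n_blocks=3, vars_per_block=4, noise=0.5, seed=42):
    """Return (names, rows). Variables in the same block share one latent factor."""
    rng = random.Random(seed)
    names = [f"b{b + 1}_x{v + 1}" for b in range(n_blocks) for v in range(vars_per_block)]
    rows = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_blocks):
            factor = rng.gauss(0, 1)  # latent factor for this block
            row.extend(factor + rng.gauss(0, noise) for _ in range(vars_per_block))
        rows.append(row)
    return names, rows


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_groups(names, rows, threshold=0.5):
    """Group variables linked by |r| >= threshold (connected components).

    Assumption: this is a crude sanity check, not VDMML's clustering method.
    """
    cols = list(zip(*rows))
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(pearson(cols[i], cols[j])) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


def write_csv(path, names, rows):
    """Write the data with a header row so it can be imported elsewhere."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(names)
        writer.writerows(rows)


names, rows = make_blocked_data()
groups = correlation_groups(names, rows)
for g in groups:
    print(g)
```

Calling write_csv("blocked_vars.csv", names, rows) (a hypothetical file name) would save the table for import into a project. Whether the Variable Clustering node then reports exactly three clusters still depends on its own settings, such as the Rho value, so treat the block structure as a starting point.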