Hello,
I am still searching for a good data set that showcases variable clustering in VDMML, ideally one that yields three or four clusters of numeric variables so I can demonstrate reducing multicollinearity.
Any assistance would be appreciated!
Hi Ghabek. You can get three clusters with the data set pva_raw_data, which is used in the EM Applied Analytics course. I had to adjust the Variable Clustering node properties to get the three clusters. My changes from the defaults: deselect the "Use default maximum number of variables per cluster" checkbox, set "Number of variables per cluster upper threshold" to 8, and set "Clustering Rho value" to 0.7.
Hello,
I ran it with your settings and still get only one cluster for the variables. I'm using Average Gift as the interval target to run the pipeline with the Variable Clustering node and your settings.
I got the data set from the web; it has 7600 records.
Any ideas why I can't get three clusters?
Thanks!
Yours is a different data set from the one provided in the SAS course, so the same settings will not necessarily give the same clusters. If you can, please contact SAS to request the course data.
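If the course data is hard to obtain, one alternative is to build a small synthetic data set in which the cluster structure is known in advance. The Python sketch below is only an illustration under stated assumptions: it generates three hypothetical latent factors, derives four noisy numeric variables from each, and then groups variables whose absolute Pearson correlation is at least 0.7. That grouping rule is a simple stand-in and is not the algorithm the Variable Clustering node uses; the 0.7 threshold is likewise not equivalent to the node's Clustering Rho value. All function and parameter names here are made up for the example.

```python
import math
import random


def make_clustered_data(n_rows=2000, n_groups=3, vars_per_group=4,
                        noise=0.3, seed=0):
    """Return a list of columns; each group of columns shares one latent factor."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_groups):
        # One hidden factor per group (assumption: standard normal).
        factor = [rng.gauss(0, 1) for _ in range(n_rows)]
        for _ in range(vars_per_group):
            # Each observed variable is the factor plus independent noise.
            columns.append([f + noise * rng.gauss(0, 1) for f in factor])
    return columns


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_clusters(columns, threshold=0.7):
    """Label variables so that any two linked by |r| >= threshold share a label."""
    n = len(columns)
    corr = [[abs(pearson(columns[i], columns[j])) for j in range(n)]
            for i in range(n)]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first walk over the "strongly correlated" graph.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and corr[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


columns = make_clustered_data()
labels = correlation_clusters(columns)
print(labels)
```

With these settings the twelve variables fall into three groups of four. Writing the columns out to a CSV file and importing it into a project would give a data set where three variable clusters are expected by construction, although the node's own result may still depend on its property settings.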