MinalMMurkhande
Calcite | Level 5


Hello Everyone,

I have a dataset with about 1000 variables (all numeric) and no target variable, so this is an unsupervised problem. It has a "zipcode" column, and my goal is to form meaningful clusters from this dataset to analyze the associations between the zip codes. I want to reduce the number of variables (dimensionality reduction) so that I can pass the reduced dataset to PROC VARCLUS. Is there an effective procedure for dimensionality reduction on unsupervised datasets? I am using Enterprise Miner and Enterprise Guide. Any related response would be of great help. Thank you!

2 REPLIES
M_Maldonado
Barite | Level 11

Hey Minal,

1000 inputs is not that many, so I think you can run the Cluster or HPCluster nodes directly on those inputs. I am not clear on what you are planning to do with the zip codes. Were you planning to run a Cluster node on your 1000 inputs and then compare those clusters to your zip codes? Or did you have something else in mind?
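Outside the Enterprise Miner nodes, clustering directly on all inputs could look roughly like the sketch below in Enterprise Guide. The dataset name work.zipdata, the input names x1-x1000, and the choice of 5 clusters are all assumptions for illustration, not details from the original post.

```sas
/* Hypothetical names: work.zipdata holds one row per zip code,   */
/* with numeric inputs x1-x1000 and an identifier column zipcode. */

/* Standardize the inputs so that no single variable dominates */
/* the distance calculation.                                   */
proc stdize data=work.zipdata out=work.zip_std method=std;
   var x1-x1000;
run;

/* k-means style clustering on all standardized inputs.      */
/* maxclusters=5 is an arbitrary placeholder, not a          */
/* recommendation; try several values and compare.           */
proc fastclus data=work.zip_std maxclusters=5 maxiter=100
              out=work.zip_clusters;
   var x1-x1000;
   id zipcode;
run;
```

The OUT= dataset adds a CLUSTER variable to each row, so every zip code can be mapped to its assigned cluster.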

You can use the Variable Clustering and Principal Components nodes in Enterprise Miner for dimension reduction, but I am not sure that you need it.
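For anyone working in code rather than with the nodes, the two dimension reduction options correspond roughly to PROC PRINCOMP and PROC VARCLUS. This is only a minimal sketch: the dataset name work.zipdata, the input names x1-x1000, the 20 retained components, and the MAXEIGEN threshold are assumptions chosen for illustration.

```sas
/* Hypothetical names: work.zipdata holds one row per zip code, */
/* with numeric inputs x1-x1000.                                */

/* Option 1: principal components. Keep the first 20 components */
/* (20 is an arbitrary placeholder). The OUT= dataset contains  */
/* the original columns plus the scores Prin1-Prin20.           */
proc princomp data=work.zipdata out=work.zip_pcs n=20;
   var x1-x1000;
run;

/* Option 2: variable clustering. Groups correlated inputs so   */
/* that one representative per group can be kept. A cluster is  */
/* split while its second eigenvalue exceeds MAXEIGEN; 0.7 is   */
/* an illustrative value. SHORT trims the printed output.       */
proc varclus data=work.zipdata maxeigen=0.7 short;
   var x1-x1000;
run;
```

The component scores or the selected representative variables can then be passed to the clustering step in place of the full set of inputs.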

Good luck!

-Miguel

MinalMMurkhande
Calcite | Level 5

Hi Miguel,

Thank you for your reply. Yes, I was planning to run the Cluster node on the 1000 inputs and then map the observations back to their respective zip codes. FYI, each observation is identified by a unique zip code. This is the only approach I could think of. Is there any other efficient method or procedure for dimensionality reduction on an unsupervised dataset, other than using the Cluster node?

-Regards,

Minal.
