chemicalab
Fluorite | Level 6

Hi all,

Are there any more advanced and robust methods in Base SAS, besides PROC VARCLUS or principal components, that can be used for variable reduction?

I am trying to perform a cluster analysis with over a hundred variables, so I was wondering if there is something out there that can help reduce the number of variables and also identify the strongest discriminators in my data.

Kind regards

2 REPLIES
M_Maldonado
Barite | Level 11

Hi Chemicalab,

PROC PRINCOMP and PROC VARCLUS are the go-to methods, as you mention (strictly speaking, both ship with SAS/STAT rather than Base SAS).
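For readers who want a starting point, here is a minimal sketch of both procedures. The data set name WORK.MYDATA, the variable names x1-x120, the MAXEIGEN= threshold, the number of components, and the number of clusters are all illustrative assumptions, not values from this thread; tune them to your own data.

```sas
/* Hypothetical input: WORK.MYDATA with numeric variables x1-x120. */

/* Option 1: group correlated variables into clusters.
   MAXEIGEN=0.7 (an assumed threshold) keeps splitting a cluster while
   its second eigenvalue is larger than 0.7.
   SHORT suppresses some of the longer printed matrices. */
proc varclus data=work.mydata maxeigen=0.7 short;
   var x1-x120;
run;

/* Option 2: replace the original variables with the first few
   principal components (N=10 is an assumption), then cluster on the
   component scores Prin1-Prin10 written to the OUT= data set. */
proc princomp data=work.mydata out=work.pc_scores n=10;
   var x1-x120;
run;

proc fastclus data=work.pc_scores maxclusters=5 out=work.clustered;
   var prin1-prin10;
run;
```

With PROC VARCLUS, a common convention is to keep one representative per cluster, typically the variable with the lowest "1-R**2 Ratio" in the printed output. That keeps original, interpretable variables, whereas principal components are linear combinations of all inputs.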

A different approach, if you have access to SAS Enterprise Miner: calculate variable importance with a tree-based model node, then review the resulting importance values for your variables.
Please note that these nodes have the variable selection option set to Yes by default. This means that if you connect any of these nodes to a Cluster node, only the most important variables (relative variable importance greater than or equal to 0.05) are passed on. A few considerations below.

  • Decision Tree node - variable importance is calculated using only one decision tree.
  • HPForest node - variable importance is calculated using a random forest model, which is more robust. This node is available in SAS Enterprise Miner 12.3 or newer.
  • Gradient Boosting node - it is very robust, but the sequential nature of this algorithm makes it take some time to run.


I hope it helps,
Thanks,

Miguel

chemicalab
Fluorite | Level 6

Unfortunately I don't have EM, so I guess I will have to go with PROC PRINCOMP or PROC VARCLUS. Thank you for the reply.
