EricTsai
Calcite | Level 5

Hi,

Does anyone know how to use random forest in Enterprise Miner to do variable selection? I have a large number of original variables to start with, roughly 7,000 to 8,000.

I have tried using the Variable Clustering node in Enterprise Miner, but would love to learn more ways of doing variable selection for large data sets.

Thanks a lot.

2 REPLIES
M_Maldonado
Barite | Level 11

Hi Yi-Chun,

By default, your HPForest node has the Variable Selection property set to Yes. All you need to do is run an HPForest node and then connect any other node after it. Variables whose out-of-bag reduction is less than or equal to zero are rejected after your HPForest node.
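To make the rejection rule concrete, here is a minimal Python sketch of the threshold logic only. It is not Enterprise Miner code and does not reproduce how HPForest computes its importance measures. The function name `split_by_oob_reduction`, the variable names, and the numbers are all hypothetical, made up for illustration.

```python
def split_by_oob_reduction(oob_reduction):
    """Partition variables by the rule described above.

    oob_reduction maps a variable name to its out-of-bag reduction.
    A variable is kept only when that value is strictly positive;
    values less than or equal to zero are rejected.
    """
    kept = [name for name, value in oob_reduction.items() if value > 0]
    rejected = [name for name, value in oob_reduction.items() if value <= 0]
    return kept, rejected


# Hypothetical importance values, not output from a real HPForest run.
example = {"x1": 0.0142, "x2": 0.0, "x3": -0.0007, "x4": 0.0031}

kept, rejected = split_by_oob_reduction(example)
print("kept:", kept)
print("rejected:", rejected)
```

Note that a value of exactly zero is rejected, so a variable the forest never found useful out of bag is dropped along with the ones that made things worse.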

good luck!

miguel

EricTsai
Calcite | Level 5

Hi, Miguel:

In your experience, if you have thousands of potential independent variables to start with in your modeling process, what would you try in order to reduce this huge number to a more manageable list?

Have you seen any method that gives you a more predictive model at the end?

What is the state of the art in this area right now at SAS?
