SAS Data Science

Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Viya (Machine Learning), SAS Visual Text Analytics, with point-and-click interfaces or programming
msf2021
Fluorite | Level 6

Hello!

 

I have built a random forest in SAS Enterprise Miner for a classification task. My target variable is Target (1 = event, 0 = non-event), and I identified the 20 most important variables. I then kept only these 20 and ran the HPForest node again. All my metrics are consistent between train (80% split) and test (20% split), except for cumulative % captured response, which differs significantly: ~30% in the 1st decile on train versus ~20% in the 1st decile on test. I found that changing parameters such as mtry and the maximum number of trees changes these results, but is there a way to find the optimal parameters? Trying different combinations by hand is not easy, and I have not been able to achieve good results.

 

I have already tried this methodology: Tip: Getting the Most from your Random Forest - SAS Support Communities. However, it only considers interval inputs, whereas I have both interval and categorical ones, and I cannot achieve better results with this approach either.

 

Thanks

1 REPLY
sbxkoenk
SAS Super FREQ

Hello @msf2021,

 

What is the variable importance table / importance plot telling you?

 

Maybe the top 20 variables are only responsible for 50% of the total importance?
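
To check this, the cumulative share of importance can be computed directly from the importance table. Below is a minimal Python sketch, assuming the table has been exported as variable name / importance pairs with non-negative values. The function name and the example numbers are made up for illustration and are not SAS output.

```python
def cumulative_importance_share(importances, top_k):
    """Return the fraction of total importance carried by the top_k variables.

    importances: dict mapping variable name -> non-negative importance value
    (an assumed export format, not something Enterprise Miner produces as-is).
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("total importance must be positive")
    # Rank the importance values from largest to smallest and keep the top_k.
    ranked = sorted(importances.values(), reverse=True)
    return sum(ranked[:top_k]) / total


# Illustrative values only: the two strongest of four variables.
example = {"x1": 4.0, "x2": 3.0, "x3": 2.0, "x4": 1.0}
share = cumulative_importance_share(example, top_k=2)  # (4 + 3) / 10
```

If the share for your top 20 turns out to be low, the dropped variables may still carry a good part of the signal.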

 

You can also have a look here:

SAS Tutorial | How to train forest models in SAS?
https://www.youtube.com/watch?v=FWragzNF59U

 

SAS Tutorial | How to Pick Hyperparameters of Machine Learning Models?

https://www.youtube.com/watch?v=AOR7XnCB_JA
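
As a rough illustration of searching parameters systematically instead of by hand, here is a minimal Python sketch of an exhaustive grid search ranked by captured response in the first decile. It is not the HPForest node's own procedure. The callable `fit_and_score` and the grid keys `mtry` and `ntrees` are hypothetical placeholders: you would supply a function that trains a forest with the given settings and returns the metric on the training and validation partitions.

```python
from itertools import product


def captured_response_top_decile(y_true, scores):
    """Share of all events that fall in the 10% of rows with the highest score."""
    n_events = sum(y_true)
    if n_events == 0:
        raise ValueError("y_true contains no events")
    # Row indices ordered from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_n = max(1, len(order) // 10)
    return sum(y_true[i] for i in order[:top_n]) / n_events


def grid_search(fit_and_score, grid):
    """Evaluate every parameter combination and rank by validation metric.

    fit_and_score(params) is a user-supplied (hypothetical) callable that
    returns a (train_metric, valid_metric) tuple for one combination.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        train_metric, valid_metric = fit_and_score(params)
        results.append({
            "params": params,
            "train": train_metric,
            "valid": valid_metric,
            "gap": train_metric - valid_metric,
        })
    # Best validation metric first; a smaller train/validation gap breaks ties.
    results.sort(key=lambda r: (-r["valid"], r["gap"]))
    return results
```

Ranking on a validation partition rather than on the final test partition keeps the test data untouched for an honest last check, and the recorded gap makes it easy to spot combinations that fit the training data much better than new data.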

 

You can also select the most important variables upfront with other techniques.

I am not sure whether PROC VARREDUCE was already available in the Enterprise Miner era, though.

 

Thanks,

Koen
