maculdes
Fluorite | Level 6

 

Hello,

How many leaves and nodes should the decision tree in my model have?

What would be the best number of leaves?

Thanks!

1 ACCEPTED SOLUTION

JasonXin
SAS Employee

Hi, 

 

If you are using SAS software such as Enterprise Miner or HPSPLIT, the default settings for these parameters will, more often than not, give you a fairly good baseline decision tree model.

 

In Enterprise Miner, where you can build what we call an interactive tree, you can inject any variable-based rules to stop, expand, or prune a tree. You can also combine this kind of 'manual' tree with machine-built trees. Machine-built trees are what most predictive modelers mean when they talk about decision tree modeling. I believe your question is about machine-built decision trees (DT).

 

Best this, best that: the key is one word, validation. Where to stop, how many trees, how many variables to try (in other words, if you have 500K variables, it is not a good idea to pump them all into the tree engine at once), pruning guidance, surrogates... should all be decided on hold-out samples. As for the deciding criteria (which I believe is what you are literally asking about), cost-complexity and the balance between training and validation performance outweigh so-called accuracy. Best practice typically involves rounds and rounds of tweaking.
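To make the hold-out idea concrete, here is a minimal sketch in plain Python (not SAS, and not how Enterprise Miner or HPSPLIT work internally). It grows a tiny one-input classification tree at several maximum depths and keeps the depth with the lowest validation error. The synthetic data, the 70/30 split, the Gini split rule, and every function name are assumptions made up for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(rows, max_depth, min_leaf=5, depth=0):
    """Grow a binary tree on rows of (x, y): one numeric input, 0/1 target."""
    labels = [y for _, y in rows]
    node = {"leaf": True, "pred": 1 if 2 * sum(labels) >= len(labels) else 0}
    if depth >= max_depth or gini(labels) == 0.0:
        return node
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [r for r in rows if r[0] <= threshold]
        right = [r for r in rows if r[0] > threshold]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # stopping rule: every leaf needs at least min_leaf rows
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right])) / len(rows)
        if best is None or score < best[0]:
            best = (score, threshold, left, right)
    if best is None:
        return node
    _, threshold, left, right = best
    return {"leaf": False, "threshold": threshold,
            "left": build_tree(left, max_depth, min_leaf, depth + 1),
            "right": build_tree(right, max_depth, min_leaf, depth + 1)}

def predict(tree, x):
    """Walk from the root to a leaf and return that leaf's majority class."""
    while not tree["leaf"]:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["pred"]

def count_leaves(tree):
    if tree["leaf"]:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

def error_rate(tree, rows):
    """Misclassification rate of the tree on rows of (x, y)."""
    return sum(predict(tree, x) != y for x, y in rows) / len(rows)

# Synthetic data (an assumption for illustration): the target is 1 inside
# the band 0.3 < x < 0.7, with roughly 15% of the labels flipped as noise.
rng = random.Random(42)
data = []
for _ in range(400):
    x = rng.random()
    y = 1 if 0.3 < x < 0.7 else 0
    if rng.random() < 0.15:
        y = 1 - y
    data.append((x, y))
train, valid = data[:280], data[280:]  # 70% training, 30% hold-out

results = []
for max_depth in range(1, 9):
    tree = build_tree(train, max_depth)
    results.append((max_depth, count_leaves(tree),
                    error_rate(tree, train), error_rate(tree, valid)))

# Pick the lowest validation error; break ties in favor of fewer leaves.
best = min(results, key=lambda r: (r[3], r[1]))
for depth, leaves, train_err, valid_err in results:
    print(f"depth={depth} leaves={leaves} train={train_err:.3f} valid={valid_err:.3f}")
print("chosen (depth, leaves, train error, validation error):", best)
```

Training error can only go down as the tree gets deeper, so it cannot tell you where to stop. The validation column is the one to watch, and the tie-break toward fewer leaves reflects the preference for a simpler tree when two candidates validate equally well.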

 

In the latest and greatest SAS Viya ML suite, you have access to a facility called Auto Tuning that lets you set ranges on (hyper)parameters, like those mentioned in your question, and lets Viya tell you which combination is optimal. The search routine goes beyond the brute-force nature of grid search (Latin hypercube, anyone?). It scales directly, so the modeler can run it against a huge data set in memory.
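For readers who have not met Latin hypercube sampling, the sketch below shows the basic idea in plain Python. It is not Viya's implementation, and the hyperparameter names and ranges are hypothetical. Each range is cut into as many equal strata as there are samples, one value is drawn per stratum, and the strata are shuffled independently per parameter, so a handful of candidates still covers every range evenly.

```python
import random

def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples candidates; ranges maps a name to a (low, high) pair."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        # One random point inside each of the n_samples equal-width strata.
        points = [low + width * (i + rng.random()) for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges}
            for i in range(n_samples)]

# Hypothetical tuning ranges for a decision tree (names are assumptions).
ranges = {"max_depth": (2, 12), "min_leaf_size": (5, 100)}
samples = latin_hypercube(ranges, n_samples=6)
for s in samples:
    # Round to integers before handing the values to a tree learner.
    print({name: round(value) for name, value in s.items()})
```

A full grid over the same two ranges with 6 values each would need 36 model fits. Here 6 fits already touch every sixth of both ranges, and each candidate would then be scored on a hold-out sample.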

 

So what is the best of this and that? Go to work. Decision trees are unlike many other methods and algorithms. In many cases, you know the best when you see it, like gardening, because there is a visual tree for you to look at.

 

Hope this helps.

 

Best Regards

Jason Xin


9 REPLIES 9
paulkaefer
Lapis Lazuli | Level 10

This is a tough question to answer. In reality, there is no best answer. A good place to start is Wikipedia. From the article on decision trees: "An optimal decision tree is then defined as a tree that accounts for most of the data, while minimizing the number of levels (or 'questions')." The section entitled Association rule induction links to two references to learn more about this, and about algorithms to determine the best decision tree for your data.
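One common way to formalize that trade-off between fit and size is the cost-complexity measure from CART. The Wikipedia quote itself does not name a criterion, so treat this as one possible reading rather than the definition it has in mind:

```latex
R_\alpha(T) = R(T) + \alpha \, |\tilde{T}|
```

Here $R(T)$ is the error of tree $T$ (for example its misclassification rate), $|\tilde{T}|$ is its number of leaves, and $\alpha \ge 0$ is the penalty paid per leaf. With $\alpha = 0$ the largest tree always wins; as $\alpha$ grows, smaller trees are preferred. The value of $\alpha$ is usually chosen on validation data or by cross-validation.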

maculdes
Fluorite | Level 6

Thanks for your help.
I have seen on the web that people configure the tree interactively, but they do not show the steps to follow.
How can I do that?
I have attached an image of what I want to do.

Thanks


tree.jpg
JasonXin
SAS Employee
Hi, if you don't see the button on the right to click, open, and configure, chances are the image comes from a different version of EM. Thanks. Jason Xin
maculdes
Fluorite | Level 6

Hello,

The version in the image is from March 2014; that is what I see in the paper. On my PC, the version is SAS Enterprise Miner 13.1, and I do not see the button.

I do not see or know how to configure the parameters.

Thanks

JasonXin
SAS Employee
If possible, could you point me to the paper? Thanks.
Jason Xin
maculdes
Fluorite | Level 6

Hello Jason,

This is the paper: http://digital.bl.fcen.uba.ar/Download/Tesis/Tesis_5612_Padua.pdf

See pages 56 and 57.

I want to modify my tree as shown in the figure on page 57, where you can set intervals for the parameters.

Thanks!!!

 

M_Maldonado
Barite | Level 11

Hey, if you haven't already, check out the Getting Started with EM series.

In this one, Chip Robie talks about decision trees, and you will see some brief screenshots of the interactive decision trees, starting at 9:15.

Good luck!

-Miguel

 

https://youtu.be/IlUZYlgkeSc?t=9m15s

maculdes
Fluorite | Level 6
Thanks, Miguel!!
