Hello,
How many leaves and nodes should a tree have in my decision tree model?
What would be the best number of leaves?
Thanks!
Hi,
If you are using SAS software like Enterprise Miner or HPSPLIT, the default settings for these parameters will, more often than not, give you a fairly good baseline decision tree model.
In Enterprise Miner, where you can build what we call an interactive tree, you can inject any variable-based rules to stop, expand or prune a tree. You can also combine this kind of 'manual' tree with machine-built trees. Machine-built trees are what most predictive modelers mean when they talk about decision tree modeling. I believe your question is about a machine-built tree (DT).
Best this, best that: the key is one word, validation. Where to stop, how many trees, how many variables to try (in other words, if you have 500K variables, it is not a good idea to pump them all into the tree engine at once), pruning guidance, surrogates... should all be decided on hold-out samples. As for the deciding criterion (which I believe is what you are literally asking about), cost-complexity, the balance between training and validation performance, outweighs so-called accuracy. Best practice typically involves rounds and rounds of tweaking.
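To make the hold-out idea concrete, here is a minimal sketch in plain Python. It is not how Enterprise Miner or HPSPLIT work internally; it is a toy one-variable regression tree, grown greedily on synthetic data invented for this illustration, where the number of leaves is picked by the lowest error on a validation sample. All function names (`grow`, `fit`, `mse`) are hypothetical.

```python
import bisect
import random

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(points):
    """Return (gain, threshold) for the best split of one leaf, or None."""
    points = sorted(points)
    ys = [y for _, y in points]
    base = sse(ys)
    best = None
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        gain = base - sse(ys[:i]) - sse(ys[i:])
        if best is None or gain > best[0]:
            best = (gain, (points[i - 1][0] + points[i][0]) / 2)
    return best

def grow(train, n_leaves):
    """Greedy growth: keep splitting the leaf with the largest gain."""
    leaves, thresholds = [list(train)], []
    while len(leaves) < n_leaves:
        options = [(best_split(leaf), i) for i, leaf in enumerate(leaves)]
        options = [(split, i) for split, i in options if split is not None]
        if not options:
            break
        (_gain, thr), i = max(options, key=lambda o: o[0][0])
        leaf = leaves.pop(i)
        leaves.append([p for p in leaf if p[0] <= thr])
        leaves.append([p for p in leaf if p[0] > thr])
        thresholds.append(thr)
    return sorted(thresholds)

def fit(train, n_leaves):
    """Return (thresholds, leaf means); each leaf predicts its training mean."""
    thresholds = grow(train, n_leaves)
    sums = [0.0] * (len(thresholds) + 1)
    counts = [0] * (len(thresholds) + 1)
    for x, y in train:
        leaf = bisect.bisect_left(thresholds, x)
        sums[leaf] += y
        counts[leaf] += 1
    return thresholds, [s / c for s, c in zip(sums, counts)]

def mse(model, data):
    """Mean squared error of a fitted model on a data set."""
    thresholds, means = model
    errors = ((y - means[bisect.bisect_left(thresholds, x)]) ** 2 for x, y in data)
    return sum(errors) / len(data)

def make_data(n):
    """Synthetic data (an assumption): three flat levels plus Gaussian noise."""
    data = []
    for _ in range(n):
        x = random.random()
        level = 0.0 if x < 1 / 3 else 5.0 if x < 2 / 3 else 10.0
        data.append((x, level + random.gauss(0, 1)))
    return data

random.seed(0)
train, valid = make_data(150), make_data(150)

train_mse, val_mse = {}, {}
for k in range(1, 13):
    model = fit(train, k)
    train_mse[k] = mse(model, train)
    val_mse[k] = mse(model, valid)

# Smallest leaf count with the lowest validation error.
best_k = min(val_mse, key=val_mse.get)
print("leaves chosen on the hold-out sample:", best_k)
```

Training error can only go down as leaves are added, so it cannot tell you where to stop; the validation error is what flattens out or turns back up once the tree starts fitting noise.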
In the latest and greatest SAS Viya ML suite, you have access to a facility called Auto Tuning that lets you set ranges on (hyper)parameters, like those mentioned in your question, and let Viya tell you which combination is optimal. The search routine goes beyond the brute-force nature of grid search (Latin hypercube, anyone?). It is directly and immediately scalable, so the modeler can run it against huge data sets in memory.
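For readers who have not met Latin hypercube sampling, here is a rough sketch of the general idea in plain Python. It is a generic illustration, not Viya's actual tuning routine, and the parameter names and ranges (`max_depth`, `min_leaf_size`) are made up for the example. Each parameter's range is cut into as many equal slices as there are samples, and every slice is used exactly once, so a handful of candidate settings still covers each range evenly.

```python
import random

def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples candidate settings from the given (low, high) ranges.

    Each range is divided into n_samples equal strata; one value is drawn
    inside every stratum, then the values are shuffled per parameter so the
    strata are paired at random across parameters.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        width = (high - low) / n_samples
        values = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(values)
        columns[name] = values
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]

# Hypothetical tuning ranges, chosen only for illustration.
ranges = {"max_depth": (2, 12), "min_leaf_size": (5, 105)}
samples = latin_hypercube(ranges, 5)

for candidate in samples:
    # Round to whole numbers where the parameter is an integer setting.
    print({name: round(value) for name, value in candidate.items()})
```

A full grid with 5 values per parameter would need 25 tree fits here; the sketch proposes 5 candidates, each of which would then be scored on the validation sample.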
So what is the best of this and that? Go to work. A decision tree is unlike many other methods and algorithms. In many cases, you know the best when you see it, like gardening, because there is a visual tree for you to look at.
Hope this helps.
Best Regards
Jason Xin
This is a tough question to answer. In reality, there is no best answer. A good place to start is Wikipedia. From the article on decision trees: "An optimal decision tree is then defined as a tree that accounts for most of the data, while minimizing the number of levels (or 'questions')." The section entitled Association rule induction links to two references to learn more about this, and about algorithms to determine the best decision tree for your data.
Thanks for your help.
I have seen on the web that people do an interactive configuration, but they do not show the steps to follow.
How could I do it?
I have attached an image of what I want to do.
Thanks
Hello,
The version in the image is from March 2014; that is what I see in the paper. On my PC the version is SAS Enterprise Miner 13.1,
and I do not see that screen.
I do not see or know how to configure the parameters.
Thanks
Hello Jason,
This is the paper: http://digital.bl.fcen.uba.ar/Download/Tesis/Tesis_5612_Padua.pdf
See pages 56 and 57.
I want to modify my tree as shown in the figure on page 57. There you can set intervals for the parameters.
Thanks!
Hey, if you haven't already, check out the Getting Started with EM series.
In this one Chip Robie talks about decision trees, and you will see brief screenshots of the Interactive Decision Tree, starting at 9:15.
Good luck!
-Miguel