I'm working on a decision tree assignment for class. The data set has 131 records. When I partition the data (50-50-0), I do not get a decision tree: there is just one node with no branches or leaves. However, if I run the decision tree without partitioning, I get a tree with three branches. I suspect that sample size may be the issue.
Can anyone confirm and point me in the direction of a reading on the subject?
You can try tweaking some of the options for growing the tree to be less restrictive: for example, lowering the values for the properties Minimum Categorical Size or Leaf Size, or raising the value for Significance Level if using the ProbF or ProbChisq splitting criteria.
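To see why a leaf size setting can block every split once the data is halved, here is a minimal Python sketch. It is not how the SAS decision tree node is implemented; the function name `allowed_splits`, the binary predictor, and the counts (8 of 131 records in the minority group, a minimum leaf size of 5) are all made up for illustration.

```python
def allowed_splits(values, min_leaf_size):
    """Return thresholds t where splitting into (value <= t) and (value > t)
    leaves at least min_leaf_size records on each side."""
    ordered = sorted(values)
    n = len(ordered)
    thresholds = []
    for i in range(1, n):
        # A split is only possible between two different values.
        if ordered[i - 1] == ordered[i]:
            continue
        left, right = i, n - i
        if left >= min_leaf_size and right >= min_leaf_size:
            thresholds.append(ordered[i - 1])
    return thresholds


# Hypothetical binary predictor: 8 of the 131 records have value 1.
full = [1] * 8 + [0] * 123
# A 50-50 partition leaves roughly half of each group for training.
train = [1] * 4 + [0] * 61

print(allowed_splits(full, min_leaf_size=5))   # one valid split
print(allowed_splits(train, min_leaf_size=5))  # no valid split: minority leaf has 4 records
print(allowed_splits(train, min_leaf_size=3))  # lowering the leaf size restores the split
```

With the full data the minority leaf holds 8 records, so the split is allowed. After partitioning it holds only 4, which is below the assumed minimum of 5, so the tree stays at a single node until the leaf size is lowered.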
I agree that sample size is likely the issue.
I'm not familiar enough with decision trees to refer you to anything, but in regression a quick rule of thumb is 20 cases per predictor. For a tree, that would be roughly equivalent to 20 cases per node. Also, when the data is partitioned into small groups, you are more likely to hit extreme cases where every record with a particular value lands in either the test set or the modeling set.
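Sample size also affects significance-based splitting. The sketch below uses a plain Pearson chi-square test on a 2x2 table (1 degree of freedom) and ignores any adjustments the software applies, so treat it as an illustration only. The table counts and the 0.05 cutoff are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical candidate split on 130 records (rows: branches, columns: target classes).
stat_full, p_full = chi_square_2x2(40, 20, 30, 40)
# Same class proportions, but only half as many records in each cell.
stat_half, p_half = chi_square_2x2(20, 10, 15, 20)

print(f"full data:  chi-square = {stat_full:.3f}, p = {p_full:.4f}")
print(f"half data:  chi-square = {stat_half:.3f}, p = {p_half:.4f}")
```

With identical proportions, halving the counts halves the chi-square statistic, so the p-value rises. A split that clears a 0.05 cutoff on the full data can fail it on the training half, which is why raising the significance level can bring the tree back.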