Appropriate sample size for decision trees?

I'm doing a decision tree assignment for class. The data set has 131 records. When I partition the data (50-50-0), I do not get a decision tree: there is just one node, with no branches or leaves. However, if I run the decision tree without partitioning, I get a tree with three branches. I suspect that sample size may be the issue.

 

Can anyone confirm and point me in the direction of a reading on the subject?


Accepted Solutions
Solution
‎07-07-2017 03:08 PM
SAS Super FREQ
Posts: 272

Re: Appropriate sample size for decision trees?

You can try tweaking some of the options for growing the tree to be less restrictive: for example, lowering the values for the properties Minimum Categorical Size or Leaf Size, or raising the value for Significance Level if using the ProbF or ProbChisq splitting criteria.
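
To illustrate why the Significance Level matters more on a small partition, here is a minimal sketch in plain Python (not SAS code). It assumes a binary target and a two-branch split, and uses an unadjusted Pearson chi-square test with 1 degree of freedom; the tool's actual ProbChisq computation may include adjustments that are not modeled here. The counts, the function names, and the 0.05 and 0.10 thresholds are all hypothetical and are not taken from the original poster's data.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Rows are the two branches of a candidate split; columns are the
    two target classes.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty row or column carries no evidence for a split
    return n * (a * d - b * c) ** 2 / denom


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def split_is_kept(table, significance_level):
    """A split is kept only if its p-value does not exceed the threshold."""
    return p_value_1df(chi_square_2x2(*table)) <= significance_level


# Hypothetical counts: the same pattern seen in 131 rows and in about half of them.
full_sample = (40, 25, 26, 40)   # 131 rows
half_sample = (20, 12, 13, 20)   # 65 rows

for name, table in (("full", full_sample), ("half", half_sample)):
    p = p_value_1df(chi_square_2x2(*table))
    print(name, round(p, 3),
          "kept at 0.05:", split_is_kept(table, 0.05),
          "kept at 0.10:", split_is_kept(table, 0.10))
```

With these made-up counts, the split clears a 0.05 threshold on the full sample but not on the half sample, and raising the threshold to 0.10 lets the half-sample split through. That is the same kind of effect as raising Significance Level in the tree properties.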


All Replies
Super User
Posts: 17,868

Re: Appropriate sample size for decision trees?

I agree that sample size is likely the issue.

I'm not familiar enough with decision trees to refer you to a reading, but in regression a quick rule of thumb is 20 cases per predictor. The rough equivalent for a tree would be 20 cases per node. Also, when a small data set is partitioned, you're more likely to hit extreme cases where every record with a particular value ends up in only one partition (for example, only in your test set or only in your modeling set).
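
As a rough back-of-the-envelope check on the 20-cases-per-node idea, the sketch below (plain Python, not SAS code) counts how much room a tree has to grow before and after a 50% training partition. The 20-row minimum is just the rule of thumb above, not a known software default, and the function names are made up for illustration. The split-position count assumes a single numeric input with all distinct values.

```python
def training_rows(total_rows, train_fraction):
    """Rows left for training after partitioning (rounded down)."""
    return int(total_rows * train_fraction)


def max_leaves(n_rows, min_leaf_size):
    """Upper bound on leaves if every leaf must hold min_leaf_size rows."""
    return n_rows // min_leaf_size


def admissible_split_positions(n_rows, min_leaf_size):
    """Cut points on one sorted numeric input that leave at least
    min_leaf_size rows on each side (assumes all values are distinct)."""
    return max(0, n_rows - 2 * min_leaf_size + 1)


total = 131      # records in the original question
min_leaf = 20    # rule-of-thumb value, an assumption

for label, n in (("no partition", total),
                 ("50% training", training_rows(total, 0.5))):
    print(label, "rows:", n,
          "max leaves:", max_leaves(n, min_leaf),
          "split positions:", admissible_split_positions(n, min_leaf))
```

Under these assumptions, halving the data cuts the ceiling from 6 leaves to 3 and the number of usable cut points on one input from 92 to 26, before any significance test is applied.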
