dwmccloskey
Calcite | Level 5

I'm doing a decision tree assignment for class. The data set has 131 records. When I partition the data (50-50-0), I do not get a decision tree: there is just one node with no branches or leaves. However, if I run the decision tree without partitioning, I get a tree with three branches. I suspect that sample size may be the issue.

 

Can anyone confirm and point me in the direction of a reading on the subject?

Accepted Solutions
WendyCzika
SAS Employee

You can try tweaking some of the options for growing the tree to be less restrictive: for example, lowering the values for the properties Minimum Categorical Size or Leaf Size, or raising the value for Significance Level if using the ProbF or ProbChisq splitting criteria.
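
To illustrate why a smaller training partition can leave only the root node, here is a minimal Python sketch. It is not the tree node's actual algorithm; it only computes a plain Pearson chi-square p-value for one candidate binary split on a binary target. The counts and the 0.05 threshold are made up for illustration, and `chi2_p_value` is a hypothetical helper name.

```python
import math

def chi2_p_value(a, b, c, d):
    """P-value of the Pearson chi-square test (1 df) for a 2x2 table.

    Rows are the two branches of a candidate split; columns are the
    two target classes: [[a, b], [c, d]].
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

ALPHA = 0.05  # hypothetical significance level, chosen for illustration

# Made-up counts with similar class proportions at two sample sizes.
p_full = chi2_p_value(40, 26, 28, 37)  # all 131 records
p_half = chi2_p_value(20, 13, 14, 18)  # roughly a 50% partition (65 records)

for label, p in [("full data", p_full), ("50% partition", p_half)]:
    verdict = "split" if p < ALPHA else "no split"
    print(f"{label}: p = {p:.3f} -> {verdict}")
```

With these made-up counts, the same pattern passes the threshold on the full data but not on the half-size sample, which is why raising the significance level (or relaxing the size limits) can let the tree grow on a small partition.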


2 REPLIES
Reeza
Super User

I agree that sample size is likely the issue.

I'm not familiar enough with decision trees to refer you to a reading, but in regression a quick rule of thumb is 20 cases per predictor. That would be roughly equivalent to 20 cases per node. Also, when your data is partitioned into small groups, you're more likely to hit extreme cases where all records with a particular value end up in only the test or only the modeling data set.
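
The rule of thumb above can be turned into quick arithmetic. This Python sketch assumes a 50% training share of the 131 records and treats 20 cases per node as a minimum leaf size; the helper name `max_leaves` is hypothetical.

```python
def max_leaves(n_train, min_cases_per_leaf):
    """Upper bound on the number of leaves if every leaf needs
    at least `min_cases_per_leaf` training records."""
    return n_train // min_cases_per_leaf

n_total = 131
n_train = n_total // 2  # 50% training partition, rounded down

print(max_leaves(n_total, 20))  # ceiling on leaves with unpartitioned data
print(max_leaves(n_train, 20))  # ceiling on leaves with the training half only
```

Halving the training data halves this ceiling before any significance test is applied, so a partitioned run has much less room to split.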

