<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Appropriate sample size for decision trees? in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256665#M3798</link>
    <description>&lt;P&gt;I would agree with the sample size issue.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm not familiar enough with decision trees to refer you to anything, but in regression a quick rule of thumb is&amp;nbsp;20 cases per predictor. The rough equivalent for a tree would be 20 cases per node. &amp;nbsp;However, if your data is partitioned into small groups, you're also more likely to get extreme cases where all records with a particular value end up in only your test or only your modeling data set.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 15 Mar 2016 01:03:11 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2016-03-15T01:03:11Z</dc:date>
    <item>
      <title>Appropriate sample size for decision trees?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256662#M3797</link>
      <description>&lt;P&gt;I'm doing a decision tree assignment for class.&amp;nbsp; The data set has 131 records.&amp;nbsp; When I partition the data (50-50-0), I do not get a decision tree: there is just one node with no branches or leaves.&amp;nbsp; However, if I run the decision tree without partitioning, I get a tree with three branches.&amp;nbsp; I suspect that sample size may be the issue.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone confirm this and point me to some reading on the subject?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2016 00:19:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256662#M3797</guid>
      <dc:creator>dwmccloskey</dc:creator>
      <dc:date>2016-03-15T00:19:51Z</dc:date>
    </item>
    <item>
      <title>Re: Appropriate sample size for decision trees?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256665#M3798</link>
      <description>&lt;P&gt;I would agree with the sample size issue.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm not familiar enough with decision trees to refer you to anything, but in regression a quick rule of thumb is&amp;nbsp;20 cases per predictor. The rough equivalent for a tree would be 20 cases per node. &amp;nbsp;However, if your data is partitioned into small groups, you're also more likely to get extreme cases where all records with a particular value end up in only your test or only your modeling data set.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2016 01:03:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256665#M3798</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-03-15T01:03:11Z</dc:date>
    </item>
    <item>
      <title>Re: Appropriate sample size for decision trees?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256834#M3802</link>
      <description>&lt;P&gt;You can try making some of the tree-growing options less restrictive: for example, lowering the values of the properties &lt;STRONG&gt;Minimum Categorical Size&lt;/STRONG&gt; or &lt;STRONG&gt;Leaf Size&lt;/STRONG&gt;, or raising the value of&amp;nbsp;&lt;STRONG&gt;Significance Level&lt;/STRONG&gt; if you are using the ProbF or ProbChisq splitting criterion.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2016 16:26:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Appropriate-sample-size-for-decision-trees/m-p/256834#M3802</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2016-03-15T16:26:09Z</dc:date>
    </item>
  </channel>
</rss>

