<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Random Forest in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502698#M7405</link>
    <description>&lt;P&gt;For Random Forest in SAS what percentage of test data is used for the out-of-bag data?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How does Random Forest in SAS handle missing values?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does a macro exist that will sweep thru parameters used in PROC HPFOREST?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Ben DeKoven&lt;/P&gt;</description>
    <pubDate>Tue, 09 Oct 2018 14:09:38 GMT</pubDate>
    <dc:creator>BenjaminD</dc:creator>
    <dc:date>2018-10-09T14:09:38Z</dc:date>
    <item>
      <title>Random Forest</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502698#M7405</link>
      <description>&lt;P&gt;For Random Forest in SAS what percentage of test data is used for the out-of-bag data?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How does Random Forest in SAS handle missing values?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does a macro exist that will sweep thru parameters used in PROC HPFOREST?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Ben DeKoven&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 14:09:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502698#M7405</guid>
      <dc:creator>BenjaminD</dc:creator>
      <dc:date>2018-10-09T14:09:38Z</dc:date>
    </item>
    <item>
      <title>Re: Random Forest</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502807#M7408</link>
      <description>&lt;P&gt;Hey Ben - a few years ago I posted a &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/Tip-Getting-the-Most-from-your-Random-Forest/ta-p/223949" target="_self"&gt;tip about studying the hyperparameters of random forests and SVM&lt;/A&gt;. There is some macro code you may find useful in there for your own studies.&amp;nbsp; If you have SAS Viya, and SAS Visual Data Mining and Machine Learning in particular, you have access to a much better built-in mechanism for tuning the hyperparameters called autotuning, which uses optimization techniques to drive the exploration of the model configurations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can control the in-bag-fraction (thus indirectly the out of bag fraction) using the INBAGFRACTION option for HPFOREST as answered in &lt;A href="https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/HPforest-options-for-bagging/m-p/502790#M7407" target="_self"&gt;this post&lt;/A&gt;.&lt;/P&gt;
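&lt;P&gt;To make the in-bag / out-of-bag relationship concrete, here is a rough Python sketch (not SAS code). It assumes each tree's sample is drawn without replacement, so the out-of-bag share is simply one minus the in-bag fraction. The function name and the 0.6 value are illustrative only and are not taken from HPFOREST.&lt;/P&gt;
&lt;PRE&gt;
```python
import random


def split_in_bag(n_obs, inbag_fraction, seed=0):
    """Draw an in-bag sample without replacement; the rest is out-of-bag.

    Hypothetical helper for illustration; it does not call any SAS API.
    """
    rng = random.Random(seed)
    n_in = round(n_obs * inbag_fraction)
    in_bag = set(rng.sample(range(n_obs), n_in))
    out_of_bag = [i for i in range(n_obs) if i not in in_bag]
    return sorted(in_bag), out_of_bag


# Illustrative numbers: 1000 training rows, 60 percent of them in-bag per tree.
in_bag, oob = split_in_bag(n_obs=1000, inbag_fraction=0.6)
oob_fraction = len(oob) / 1000
print(len(in_bag), len(oob), oob_fraction)
```
&lt;/PRE&gt;
&lt;P&gt;Under that assumption, raising the in-bag fraction leaves fewer out-of-bag rows per tree for the out-of-bag error estimate.&lt;/P&gt;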
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For missing value handling, check out these options in &lt;A href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_syntax01.htm%3Flocale&amp;amp;docsetVersion=14.2&amp;amp;locale=en" target="_self"&gt;the doc&lt;/A&gt;. A complete explanation of how missing values are handled is found &lt;A href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_details21.htm&amp;amp;docsetVersion=14.2&amp;amp;locale=en" target="_self"&gt;here&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=" aa-term "&gt;MINCATSIZE=&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;specifies the minimum number of observations that a given nominal input category must have in order to use the category in a split search. Categorical values that appear in fewer than&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observations are handled as if they were missing. The categories that occur in fewer than&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;observations are merged into the pseudo category for missing values for the purpose of finding a split. The policy for assigning such observations to a branch is the same as the policy for assigning missing values to a branch. The default value of&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is 5.&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=" aa-term "&gt;MINUSEINSEARCH=&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;specifies a threshold for utilizing missing values in the split search when&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A class="ng-scope" tabindex="0" href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_syntax01.htm&amp;amp;docsetVersion=14.2&amp;amp;locale=en#emhpprcref.hpforest.proc_missing" data-docset-id="emhpprcref" data-docset-version="14.2" data-original-href="emhpprcref_hpforest_syntax01.htm#emhpprcref.hpforest.proc_missing"&gt;MISSING=USEINSEARCH&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is specified as the missing value policy. If the number of observations in which the splitting variable has missing values in a node is greater than or equal to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;, then PROC HPFOREST initiates the USEINSEARCH policy for missing values. See the section&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A class="ng-scope" tabindex="0" href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_details21.htm&amp;amp;docsetVersion=14.2&amp;amp;locale=en" data-docset-id="emhpprcref" data-docset-version="14.2" data-original-href="emhpprcref_hpforest_details21.htm"&gt;Handling Missing Values&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for a more complete explanation. The default value of&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is 1.&lt;/P&gt;
&lt;P&gt;&lt;SPAN class=" aa-term "&gt;MISSING=USEINSEARCH&amp;nbsp;|&amp;nbsp;BIGBRANCH&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;specifies how the training procedure handles an observation with missing values. If MISSING=USEINSEARCH and the number of training observations in the node is more than&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;, where&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=" aa-argument"&gt;n&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is the value of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A class="ng-scope" tabindex="0" href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_syntax01.htm&amp;amp;docsetVersion=14.2&amp;amp;locale=en#emhpprcref.hpforest.proc_minuse" data-docset-id="emhpprcref" data-docset-version="14.2" data-original-href="emhpprcref_hpforest_syntax01.htm#emhpprcref.hpforest.proc_minuse"&gt;MINUSEINSEARCH=&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;option, then the missing value is used as a separate, legitimate value in the test of association and the split search. If MISSING=BIGBRANCH, observations with a missing value of the candidate variable are omitted from the test of association and split search in that node. A splitting rule will assign such an observation to the branch containing the most observations among those used in the split search. See the section&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;A class="ng-scope" tabindex="0" href="https://go.documentation.sas.com/?docsetId=emhpprcref&amp;amp;docsetTarget=emhpprcref_hpforest_details21.htm&amp;amp;docsetVersion=14.2&amp;amp;locale=en" data-docset-id="emhpprcref" data-docset-version="14.2" data-original-href="emhpprcref_hpforest_details21.htm"&gt;Handling Missing Values&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for a more complete explanation. By default, MISSING=USEINSEARCH.&lt;/P&gt;
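&lt;P&gt;As a rough Python illustration of the two policies described above (a sketch, not the HPFOREST implementation): the first function mimics BIGBRANCH routing for a single interval input at a fixed split point, and the second applies the "greater than or equal to n" test quoted for MINUSEINSEARCH. The function names, the fixed threshold, and the tie-breaking rule are assumptions made only for this example.&lt;/P&gt;
&lt;PRE&gt;
```python
def assign_bigbranch(values, threshold):
    """Route each value to 'left' or 'right'; missing (None) follows the bigger branch.

    Missing values are left out when the branch sizes are counted,
    mirroring the idea that they are omitted from the split search.
    Ties go to 'right' here, which is an arbitrary choice for the sketch.
    """
    known = [v for v in values if v is not None]
    n_right = sum(1 for v in known if v >= threshold)
    n_left = len(known) - n_right
    big = "left" if n_left > n_right else "right"

    branches = []
    for v in values:
        if v is None:
            branches.append(big)
        elif v >= threshold:
            branches.append("right")
        else:
            branches.append("left")
    return branches


def use_missing_in_search(n_missing, min_use_in_search=1):
    """True when a node has enough missing values to treat them as their own value."""
    return n_missing >= min_use_in_search


print(assign_bigbranch([1, 2, 3, None, 10], threshold=5))
print(use_missing_in_search(0), use_missing_in_search(1))
```
&lt;/PRE&gt;
&lt;P&gt;In this toy data, three known values fall below the split point and one falls above it, so the missing value is sent to the left branch.&lt;/P&gt;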
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;
&lt;P&gt;Brett&lt;/P&gt;
</description>
      <pubDate>Tue, 09 Oct 2018 16:49:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502807#M7408</guid>
      <dc:creator>BrettWujek</dc:creator>
      <dc:date>2018-10-09T16:49:25Z</dc:date>
    </item>
    <item>
      <title>Re: Random Forest</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502822#M7409</link>
      <description>&lt;P&gt;Hello Brett,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is very helpful information.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Ben DeKoven&lt;/P&gt;</description>
      <pubDate>Tue, 09 Oct 2018 17:41:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Random-Forest/m-p/502822#M7409</guid>
      <dc:creator>BenjaminD</dc:creator>
      <dc:date>2018-10-09T17:41:38Z</dc:date>
    </item>
  </channel>
</rss>

