<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using standardized data for random forest and decision trees in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361834#M5376</link>
    <description>&lt;P&gt;Here is a really helpful article about standardizing that was just posted:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/To-standardize-data-or-not-to-standardize-data-that-is-the/ta-p/361726" target="_self"&gt;https://communities.sas.com/t5/SAS-Communities-Library/To-standardize-data-or-not-to-standardize-data-that-is-the/ta-p/361726&lt;/A&gt;&lt;/P&gt;
</description>
    <pubDate>Fri, 26 May 2017 00:41:58 GMT</pubDate>
    <dc:creator>WendyCzika</dc:creator>
    <dc:date>2017-05-26T00:41:58Z</dc:date>
    <item>
      <title>Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361735#M5371</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I was wondering if anyone can help me.&lt;/P&gt;&lt;P&gt;Is it possible, and is it appropriate, to use standardized data with random forests and decision trees?&lt;/P&gt;&lt;P&gt;If so, how does the algorithm treat the standardized data?&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Thu, 25 May 2017 18:38:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361735#M5371</guid>
      <dc:creator>geniusgenie</dc:creator>
      <dc:date>2017-05-25T18:38:01Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361759#M5372</link>
      <description>&lt;P&gt;Typically, you do not have to standardize data (z-score) for tree models (decision tree, random forest, gradient boosting, etc.), because the algorithm searches for the split point where classification or prediction is best according to some criterion. Also, when you standardize, interpretability gets a little harder: instead of saying that age &amp;gt; 25 years is a good split, you have to say that age &amp;gt; 1 standard deviation above the mean is a good split. So for tree-based models, I say: when you don't need it, why do the extra work?&lt;/P&gt;
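&lt;P&gt;As a rough illustration of why standardizing does not change a tree split, here is a minimal Python sketch (not SAS code, and not how any SAS procedure is implemented). It assumes a single interval input, a binary target, and a Gini-based split search; the names best_split and gini and the toy age data are made up for this example. Because z-scoring preserves the order of the values, the same rows land on each side of the split, and only the threshold's units change.&lt;/P&gt;
&lt;PRE&gt;
```python
from statistics import mean, pstdev


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, left_rows) for the split with the lowest weighted Gini.

    Candidate thresholds are midpoints between consecutive distinct sorted
    values. This is an illustrative search, not any product's algorithm.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = values[order[k - 1]], values[order[k]]
        if lo == hi:
            continue  # no threshold fits between equal values
        left, right = order[:k], order[k:]
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right]))
        if best is None or best[0] > score:
            best = (score, (lo + hi) / 2.0, frozenset(left))
    if best is None:
        raise ValueError("input has no two distinct values to split on")
    return best[1], best[2]


# Hypothetical toy data: ages in years and a binary target.
age = [18, 22, 24, 25, 31, 40, 52, 60]
target = [0, 0, 0, 0, 1, 1, 1, 1]

# Standardize (z-score) the same input.
mu, sigma = mean(age), pstdev(age)
age_z = [(a - mu) / sigma for a in age]

raw_threshold, raw_left = best_split(age, target)
z_threshold, z_left = best_split(age_z, target)

print(f"split on raw age at:     {raw_threshold}")
print(f"split on z-scored age at: {z_threshold}")
print(f"same rows on the left side: {raw_left == z_left}")
print(f"z threshold converted back to years: {z_threshold * sigma + mu}")
```
&lt;/PRE&gt;
&lt;P&gt;Converting the z-score threshold back with z * sigma + mu recovers the threshold in years, which is the interpretability cost mentioned above: the split is the same, but it has to be translated before it reads as an age.&lt;/P&gt;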
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;Radhikha&lt;/P&gt;</description>
      <pubDate>Thu, 25 May 2017 20:00:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361759#M5372</guid>
      <dc:creator>RadhikhaMyneni</dc:creator>
      <dc:date>2017-05-25T20:00:40Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361811#M5375</link>
      <description>Thanks for your reply, Radhikha. But if we have already standardized the data, will that make any difference or produce wrong results?</description>
      <pubDate>Thu, 25 May 2017 22:53:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361811#M5375</guid>
      <dc:creator>geniusgenie</dc:creator>
      <dc:date>2017-05-25T22:53:47Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361834#M5376</link>
      <description>&lt;P&gt;Here is a really helpful article about standardizing that was just posted:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/To-standardize-data-or-not-to-standardize-data-that-is-the/ta-p/361726" target="_self"&gt;https://communities.sas.com/t5/SAS-Communities-Library/To-standardize-data-or-not-to-standardize-data-that-is-the/ta-p/361726&lt;/A&gt;&lt;/P&gt;
</description>
      <pubDate>Fri, 26 May 2017 00:41:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361834#M5376</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2017-05-26T00:41:58Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361853#M5377</link>
      <description>Hi Wendy,&lt;BR /&gt;Thanks for sharing this wonderful article, which really clarifies things. My question remains the same, though: if my dataset is already standardized, would it be wrong to use random forest and decision tree, or is it OK?&lt;BR /&gt;&lt;BR /&gt;Also, my dataset contains 5 columns that were difficult to analyze without standardization; for example, all of them contain memory addresses such as 0x00979876. The only thing I did was convert these 5 columns to decimal representation and standardize all of them.&lt;BR /&gt;&lt;BR /&gt;The other columns contain ordinary categorical values.&lt;BR /&gt;&lt;BR /&gt;I hope this clarifies things.&lt;BR /&gt;&lt;BR /&gt;Regards</description>
      <pubDate>Fri, 26 May 2017 04:01:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361853#M5377</guid>
      <dc:creator>geniusgenie</dc:creator>
      <dc:date>2017-05-26T04:01:50Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361954#M5378</link>
      <description>&lt;P&gt;I don't think it hurts that you have already standardized. With tree-based algorithms, interval inputs are typically binned anyway (using bucket or equal-spaced binning) before the split search, so it should be fine.&lt;/P&gt;</description>
      <pubDate>Fri, 26 May 2017 13:44:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/361954#M5378</guid>
      <dc:creator>WendyCzika</dc:creator>
      <dc:date>2017-05-26T13:44:36Z</dc:date>
    </item>
    <item>
      <title>Re: Using standardized data for random forest and decision trees</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/362154#M5386</link>
      <description>Thanks a lot.</description>
      <pubDate>Sat, 27 May 2017 05:06:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Using-standadized-data-for-random-forest-and-decision-trees/m-p/362154#M5386</guid>
      <dc:creator>ali1067</dc:creator>
      <dc:date>2017-05-27T05:06:16Z</dc:date>
    </item>
  </channel>
</rss>

