<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: New predictors decrease model accuracy in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/245608#M17459</link>
    <description>&lt;P&gt;Your basic instinct, "always add more data!", is a good one. And you're sort of correct that adding more data shouldn't make your situation worse, because after all, you still have the original data, plus some more! However, remember that RPM is just a machine. You are still trusting it to identify problems with the new data and then back off. Remember that as you add more predictors, you might create a situation where your model is overfitting, meaning the model fits the training data better but fits the holdout data worse.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now, if you were building the model by hand, you might notice that and say, "oh, let me go back to my original model." But RPM will never be as smart as a human. Answering your question would require intimate knowledge of your data and of how RPM works, and I have neither.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's hard to say exactly why your situation yielded a lower Gini. Many different things could have happened, but here is my best guess:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Either:&lt;/P&gt;
&lt;P&gt;(a) The model selection isn't using Gini but some other criterion (perhaps the 2nd model produced an almost-as-good Gini with fewer variables and had a better, i.e. lower, AIC)&lt;/P&gt;
&lt;P&gt;(b) You added new variables which were all collinear with each other, but RPM only kept the top N best. The new variables pushed some of the original variables out of the top N, so those originals weren't even considered in the 2nd model.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm sorry that I'm not an expert in RPM and can't answer your question directly, but I think you're on the right path. You're adding more data, running models, and paying attention to details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck&lt;/P&gt;</description>
    <pubDate>Sat, 23 Jan 2016 05:04:29 GMT</pubDate>
    <dc:creator>JBerry</dc:creator>
    <dc:date>2016-01-23T05:04:29Z</dc:date>
    <item>
      <title>New predictors decrease model accuracy</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/238362#M17160</link>
      <description>&lt;P&gt;Good day to everyone.&lt;/P&gt;&lt;P&gt;I'm using SAS RPM for modeling and ran into the following problem:&lt;/P&gt;&lt;P&gt;I built a model on a pool of predictors and got a Gini of, say, 0.5.&lt;/P&gt;&lt;P&gt;Then I took the same data and the same predictors and added a new set of predictors, but unfortunately the model became worse (lower Gini).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my understanding, if a new predictor is "weak" and can't improve model performance, it at least shouldn't spoil it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does anyone have any ideas?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Dec 2015 19:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/238362#M17160</guid>
      <dc:creator>ilya_1991</dc:creator>
      <dc:date>2015-12-08T19:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: New predictors decrease model accuracy</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/245608#M17459</link>
      <description>&lt;P&gt;Your basic instinct, "always add more data!", is a good one. And you're sort of correct that adding more data shouldn't make your situation worse, because after all, you still have the original data, plus some more! However, remember that RPM is just a machine. You are still trusting it to identify problems with the new data and then back off. Remember that as you add more predictors, you might create a situation where your model is overfitting, meaning the model fits the training data better but fits the holdout data worse.&lt;/P&gt;
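&lt;P&gt;To make the overfitting point concrete, here is a minimal Python sketch (not RPM, and not your data). It assumes one informative predictor plus 30 pure-noise predictors, a small training set, and a plain logistic regression fitted by gradient descent; all names and sizes are made up for illustration. It prints the Gini (2 * AUC - 1) on the training and holdout parts for the model with 1 predictor and the model with all 31.&lt;/P&gt;

```python
import math
import random

def gini(labels, scores):
    """Gini = 2 * AUC - 1, with AUC computed by comparing every positive/negative pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return 2.0 * wins / (len(pos) * len(neg)) - 1.0

def predict(bias, w, x):
    """Logistic score; z is clamped so math.exp cannot overflow."""
    z = bias + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, steps=300, rate=0.1):
    """Plain batch gradient descent on the log-loss; returns (bias, weights)."""
    n, d = len(xs), len(xs[0])
    bias, w = 0.0, [0.0] * d
    for _ in range(steps):
        g_b, g_w = 0.0, [0.0] * d
        for x, y in zip(xs, ys):
            err = predict(bias, w, x) - y
            g_b += err
            for j in range(d):
                g_w[j] += err * x[j]
        bias -= rate * g_b / n
        w = [wj - rate * gj / n for wj, gj in zip(w, g_w)]
    return bias, w

# Synthetic data (an assumption): column 0 carries signal, columns 1..30 are noise.
rng = random.Random(0)
ys = [i % 2 for i in range(260)]
xs = [[rng.gauss(float(y), 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(30)] for y in ys]
train_x, train_y = xs[:60], ys[:60]
hold_x, hold_y = xs[60:], ys[60:]

for n_cols in (1, 31):
    cut = lambda rows: [r[:n_cols] for r in rows]  # keep only the first n_cols predictors
    bias, w = fit_logistic(cut(train_x), train_y)
    for name, px, py in (("train", train_x, train_y), ("holdout", hold_x, hold_y)):
        scores = [predict(bias, w, x) for x in cut(px)]
        print(n_cols, "predictors,", name, "Gini:", round(gini(py, scores), 3))
```

&lt;P&gt;In a setup like this, the 31-predictor model will typically show a higher training Gini and a lower holdout Gini than the 1-predictor model, which is the pattern described above. The exact numbers depend on the seed and the sample sizes.&lt;/P&gt;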
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now, if you were building the model by hand, you might notice that and say, "oh, let me go back to my original model." But RPM will never be as smart as a human. Answering your question would require intimate knowledge of your data and of how RPM works, and I have neither.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's hard to say exactly why your situation yielded a lower Gini. Many different things could have happened, but here is my best guess:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Either:&lt;/P&gt;
&lt;P&gt;(a) The model selection isn't using Gini but some other criterion (perhaps the 2nd model produced an almost-as-good Gini with fewer variables and had a better, i.e. lower, AIC)&lt;/P&gt;
&lt;P&gt;(b) You added new variables which were all collinear with each other, but RPM only kept the top N best. The new variables pushed some of the original variables out of the top N, so those originals weren't even considered in the 2nd model.&lt;/P&gt;
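&lt;P&gt;Guess (b) can be sketched in a few lines of Python. This is only an illustration under assumptions: I don't know how RPM actually ranks or filters variables, so the sketch uses a hypothetical top-N filter based on absolute correlation with the target, and the variable names (x1, x2, x3, new0..new2) and N = 2 are invented.&lt;/P&gt;

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def top_n(columns, target, n):
    """Hypothetical pre-filter: keep the n columns most correlated with the target."""
    ranked = sorted(columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True)
    return ranked[:n]

rng = random.Random(1)
rows = 500
x1 = [rng.gauss(0, 1) for _ in range(rows)]  # strong original predictor
x2 = [rng.gauss(0, 1) for _ in range(rows)]  # weaker original predictor with independent information
x3 = [rng.gauss(0, 1) for _ in range(rows)]  # original predictor that is pure noise
target = [1 if a + 0.5 * b + rng.gauss(0, 1) > 0 else 0 for a, b in zip(x1, x2)]

original = {"x1": x1, "x2": x2, "x3": x3}

# New variables: three near-copies of x1, so they are collinear with x1 and with each other.
extended = dict(original)
for i in range(3):
    extended["new%d" % i] = [a + rng.gauss(0, 0.05) for a in x1]

before = top_n(original, target, 2)
after = top_n(extended, target, 2)
print("top 2 before adding variables:", before)
print("top 2 after adding variables: ", after)
```

&lt;P&gt;Before the new variables are added, x2 makes the cut. Afterwards the top 2 is filled entirely by x1 and its near-copies, so x2, the only other variable with independent information, never reaches the model, and the final model can end up worse even though no data was taken away.&lt;/P&gt;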
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm sorry that I'm not an expert in RPM and can't answer your question directly, but I think you're on the right path. You're adding more data, running models, and paying attention to details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck&lt;/P&gt;</description>
      <pubDate>Sat, 23 Jan 2016 05:04:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/245608#M17459</guid>
      <dc:creator>JBerry</dc:creator>
      <dc:date>2016-01-23T05:04:29Z</dc:date>
    </item>
    <item>
      <title>Re: New predictors decrease model accuracy</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/246739#M17507</link>
      <description>Thank you very much for your answer&lt;BR /&gt;</description>
      <pubDate>Thu, 28 Jan 2016 20:20:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/New-predictors-decrease-model-accuracy/m-p/246739#M17507</guid>
      <dc:creator>ilya_1991</dc:creator>
      <dc:date>2016-01-28T20:20:39Z</dc:date>
    </item>
  </channel>
</rss>

