<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Forward selection of statistically insignificant variables in SAS Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406315#M6195</link>
    <description>To sum up, the test level is .05 but the p-values for the categories are around .93.</description>
    <pubDate>Sat, 21 Oct 2017 21:14:31 GMT</pubDate>
    <dc:creator>mmaccora</dc:creator>
    <dc:date>2017-10-21T21:14:31Z</dc:date>
    <item>
      <title>Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406308#M6190</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;In SAS Enterprise Miner, I trained a logistic regression with forward selection and the AIC criterion.&lt;BR /&gt;&lt;BR /&gt;I grouped rare levels for categorical variables. One of these variables was selected by the algorithm, but none of its category coefficients was statistically significant (that is, significantly different from 0).&lt;BR /&gt;&lt;BR /&gt;Why would the algorithm select such a variable if none of its categories is significant? Does anyone know a scientific explanation?&lt;BR /&gt;&lt;BR /&gt;Thank you for your help,&lt;BR /&gt;Marco</description>
      <pubDate>Sat, 21 Oct 2017 20:29:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406308#M6190</guid>
      <dc:creator>mmaccora</dc:creator>
      <dc:date>2017-10-21T20:29:07Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406311#M6191</link>
      <description>&lt;P&gt;What were the p-values?&lt;/P&gt;</description>
      <pubDate>Sat, 21 Oct 2017 20:49:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406311#M6191</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-10-21T20:49:20Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406312#M6192</link>
      <description>P-value = .05</description>
      <pubDate>Sat, 21 Oct 2017 21:10:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406312#M6192</guid>
      <dc:creator>mmaccora</dc:creator>
      <dc:date>2017-10-21T21:10:09Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406313#M6193</link>
      <description>&lt;P&gt;Then I suspect the p-values are less than the p-value cutoff that was specified in the restrictions.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Oct 2017 21:11:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406313#M6193</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-10-21T21:11:59Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406314#M6194</link>
      <description>The p-values for the categories were around .93</description>
      <pubDate>Sat, 21 Oct 2017 21:12:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406314#M6194</guid>
      <dc:creator>mmaccora</dc:creator>
      <dc:date>2017-10-21T21:12:19Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406315#M6195</link>
      <description>To sum up, the test level is .05 but the p-values for the categories are around .93.</description>
      <pubDate>Sat, 21 Oct 2017 21:14:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406315#M6195</guid>
      <dc:creator>mmaccora</dc:creator>
      <dc:date>2017-10-21T21:14:31Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406316#M6196</link>
      <description>&lt;P&gt;You said earlier it was 0.05?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you can post the parameter estimates table that may help.&lt;/P&gt;
&lt;P&gt;But it may be that this was the model with the best AIC, in which case the significance of the individual variables isn't considered. Remember that the cutoff of 0.05 is arbitrary, but I'm surprised that a categorical variable with p-values of 0.93 would be included.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Oct 2017 21:15:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406316#M6196</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-10-21T21:15:32Z</dc:date>
    </item>
    <item>
      <title>Re: Forward selection of statistically insignificant variables</title>
      <link>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406320#M6197</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/82618"&gt;@mmaccora&lt;/a&gt; wrote:&lt;BR /&gt;Hi,&lt;BR /&gt;&lt;BR /&gt;In SAS Enterprise Miner, I trained a logistic regression with forward selection and the AIC criterion.&lt;BR /&gt;&lt;BR /&gt;I grouped rare levels for categorical variables. One of these variables was selected by the algorithm, but none of its category coefficients was statistically significant (that is, significantly different from 0).&lt;BR /&gt;&lt;BR /&gt;Why would the algorithm select such a variable if none of its categories is significant? Does anyone know a scientific explanation?&lt;BR /&gt;&lt;BR /&gt;Thank you for your help,&lt;BR /&gt;Marco&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The stepwise (or forward selection) algorithm selects the variable (or variables) that improves the model the most. This is not the same as, and is completely unrelated to, whether the coefficients of all categories are significant.&lt;/P&gt;
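&lt;P&gt;To make that distinction concrete, here is a minimal Python sketch (not Enterprise Miner output). Everything in it is an assumption for illustration: the counts are made up, the names counts, log_lik, aic and wald_p are hypothetical, the categorical variable is the only input, and reference coding is used. Under those assumptions the fitted rate for each level is simply its observed proportion, so the AIC with and without the variable can be computed directly and compared with the per-level Wald p-values.&lt;/P&gt;
&lt;PRE&gt;
```python
import math

# Hypothetical counts per level of one categorical input: (events, non-events).
# The reference level "ref" is deliberately tiny, so every contrast against it is noisy.
counts = {"ref": (2, 2), "A": (70, 130), "B": (130, 70)}


def log_lik(groups):
    """Binomial log-likelihood when each group is fitted with its own observed rate."""
    total = 0.0
    for events, non_events in groups:
        n = events + non_events
        for k in (events, non_events):
            if k:  # a zero count contributes nothing to the sum
                total += k * math.log(k / n)
    return total


def aic(ll, n_params):
    """Akaike information criterion: 2 * (number of parameters) - 2 * log-likelihood."""
    return 2 * n_params - 2 * ll


def wald_p(level, ref="ref"):
    """Two-sided Wald p-value for the log odds ratio of `level` versus `ref`."""
    a, b = counts[level]
    c, d = counts[ref]
    beta = math.log(a / b) - math.log(c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.erfc(abs(beta / se) / math.sqrt(2))


events = sum(e for e, _ in counts.values())
non_events = sum(f for _, f in counts.values())

# Intercept-only model: one pooled rate, one parameter.
aic_null = aic(log_lik([(events, non_events)]), 1)
# Model with the variable: one rate per level, i.e. intercept plus two dummies.
aic_full = aic(log_lik(counts.values()), len(counts))

print(f"AIC without the variable: {aic_null:.1f}")
print(f"AIC with the variable:    {aic_full:.1f}")
for level in ("A", "B"):
    print(f"Wald p-value, {level} vs ref: {wald_p(level):.2f}")
```
&lt;/PRE&gt;
&lt;P&gt;With these made-up counts the AIC drops when the variable is added, yet both level-versus-reference Wald p-values stay above .05, because the reference level has only four observations. The AIC comparison judges the variable as a whole, while each Wald test judges one contrast against the reference level.&lt;/P&gt;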
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But as long as I'm explaining, I will also point out that forward selection (and all stepwise algorithms) are highly discredited methods; they work poorly and don't give results as good as other methods. If you go to your favorite search engine and search for "problems with stepwise regression", you can read so much material on this subject that you won't be done until 2018.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Advice: just because you can use forward selection doesn't mean you should use forward selection. A technique that produces better models (better meaning a smaller root mean square error of the predicted values and a smaller root mean square error of the model coefficients) is Partial Least Squares regression. Reference: &lt;A href="http://asq.org/qic/display-item/index.html?item=13552" target="_blank"&gt;http://asq.org/qic/display-item/index.html?item=13552&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Oct 2017 21:57:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Science/Forward-selection-of-statistically-insignificant-variables/m-p/406320#M6197</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-10-21T21:57:33Z</dc:date>
    </item>
  </channel>
</rss>

