<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Best-subset instead of stepwise in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78296#M22601</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Peat,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Probably the best way to address overfitting is with bootstrapping. There is a substantial literature on it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Doc Muhlbaier&lt;/P&gt;&lt;P&gt;Duke&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 15 Jul 2013 19:07:01 GMT</pubDate>
    <dc:creator>Doc_Duke</dc:creator>
    <dc:date>2013-07-15T19:07:01Z</dc:date>
    <item>
      <title>Best-subset instead of stepwise</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78293#M22598</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Best-subset instead of stepwise question.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hello, I have classes of individuals grouped together from cluster analysis. I want to use discriminant analysis to determine group membership of new individuals based on a set of predictors. Normally, I use PROC STEPDISC to find a subset of predictors that go into the discriminant analysis, something like:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc stepdisc data=training sle=0.05 singular=0.1;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; class group;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; var VAR1--VAR25;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, recent literature indicates that stepwise selection is not as good as evaluating all possible subsets of predictors. Is there a procedure, or some other approach, that can do this? I have looked at the PHREG, REG, and LOGISTIC procedures, but they all seem to be based on numerical data rather than classes. Have I missed something, or should I just convert the group data from text to numeric?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;&lt;P&gt;peat&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 15 Jul 2013 00:56:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78293#M22598</guid>
      <dc:creator>peatjohnston</dc:creator>
      <dc:date>2013-07-15T00:56:33Z</dc:date>
    </item>
    <item>
      <title>Re: Best-subset instead of stepwise</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78294#M22599</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Best variable subset selection isn't available in PROC STEPDISC. If you have only two groups, or if you want to explore group differences two groups at a time, you can perform best variable subset selection in PROC LOGISTIC:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;title "Discriminating groups A and B";&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc logistic data=training(where=(group in ("A", "B")));&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;class group;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;model group(event="B") = VAR1 -- VAR25 / selection=score best=3 stop=5;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 15 Jul 2013 02:48:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78294#M22599</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2013-07-15T02:48:36Z</dc:date>
    </item>
    <item>
      <title>Re: Best-subset instead of stepwise</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78295#M22600</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi PG, and thanks for the response. I actually have 4 groups (sometimes more). It looks like I can just use:&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;STRONG style="font-style: inherit; font-family: inherit;"&gt;proc logistic data=training;&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;STRONG style="font-style: inherit; font-family: inherit;"&gt;class group;&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;STRONG style="font-style: inherit; font-family: inherit;"&gt;model group= VAR1 -- VAR25 / selection=score best=3 stop=5;&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;STRONG style="font-style: inherit; font-family: inherit;"&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;This is very helpful. However, is there a way to compare the output models for overfitting? e.g. 
are four predictors really better than three?&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;Cheers,&lt;/P&gt;&lt;P style="font-size: 12.727272033691406px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;peat&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 15 Jul 2013 11:29:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78295#M22600</guid>
      <dc:creator>peatjohnston</dc:creator>
      <dc:date>2013-07-15T11:29:15Z</dc:date>
    </item>
    <item>
      <title>Re: Best-subset instead of stepwise</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78296#M22601</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Peat,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Probably the best way to address overfitting is with bootstrapping. There is a substantial literature on it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Doc Muhlbaier&lt;/P&gt;&lt;P&gt;Duke&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 15 Jul 2013 19:07:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78296#M22601</guid>
      <dc:creator>Doc_Duke</dc:creator>
      <dc:date>2013-07-15T19:07:01Z</dc:date>
    </item>
    <item>
      <title>Re: Best-subset instead of stepwise</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78297#M22602</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Duke, I will look into it.&lt;/P&gt;&lt;P&gt;peat&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 15 Jul 2013 20:05:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Best-subset-instead-of-stepwise/m-p/78297#M22602</guid>
      <dc:creator>peatjohnston</dc:creator>
      <dc:date>2013-07-15T20:05:41Z</dc:date>
    </item>
  </channel>
</rss>

