<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: glmselect with lasso options ends only 2 steps in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183748#M9548</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If I specify selection=lasso(stop=ADJRSQ), SAS stops after 2 steps and shows:&lt;/P&gt;&lt;P&gt;Selection stopped at a local maximum of the AdjRSq criterion.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I specify selection=lasso(stop=SBC), SAS also stops after 2 steps and shows:&lt;/P&gt;&lt;P&gt;Selection stopped at a local minimum of the SBC criterion.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I only get 2 variables. The AdjRSq is pretty low in either test unless I specify steps=20. With the STEPS option, the AdjRSq increases. However, the purpose of using lasso is to avoid overfitting. Looking at the variables, I believe STEPS is giving me an overfitted result.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I looked at the correlations between those variables, and I don't believe all of them are strongly correlated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your help.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 04 Jun 2014 14:11:05 GMT</pubDate>
    <dc:creator>neilxu</dc:creator>
    <dc:date>2014-06-04T14:11:05Z</dc:date>
    <item>
      <title>glmselect with lasso options ends only 2 steps</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183746#M9546</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;This is my first time using glmselect with the lasso option. However, the procedure ends very quickly, always after 2 steps. I changed the STOP option, but no luck. The result is also really bad: R^2 is below 0.3. I don't understand why it just stops.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have more than 200 IVs and only 1 DV (50 records).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your input.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 03 Jun 2014 22:06:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183746#M9546</guid>
      <dc:creator>neilxu</dc:creator>
      <dc:date>2014-06-03T22:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: glmselect with lasso options ends only 2 steps</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183747#M9547</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Are there any messages in the log or in the output? It may be that your 200 IVs are highly correlated, so only two steps are needed to find an optimal set. However, it is hard to tell without more information.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve Denham&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 04 Jun 2014 12:58:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183747#M9547</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2014-06-04T12:58:17Z</dc:date>
    </item>
    <item>
      <title>Re: glmselect with lasso options ends only 2 steps</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183748#M9548</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If I specify selection=lasso(stop=ADJRSQ), SAS stops after 2 steps and shows:&lt;/P&gt;&lt;P&gt;Selection stopped at a local maximum of the AdjRSq criterion.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I specify selection=lasso(stop=SBC), SAS also stops after 2 steps and shows:&lt;/P&gt;&lt;P&gt;Selection stopped at a local minimum of the SBC criterion.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I only get 2 variables. The AdjRSq is pretty low in either test unless I specify steps=20. With the STEPS option, the AdjRSq increases. However, the purpose of using lasso is to avoid overfitting. Looking at the variables, I believe STEPS is giving me an overfitted result.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I looked at the correlations between those variables, and I don't believe all of them are strongly correlated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for your help.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 04 Jun 2014 14:11:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183748#M9548</guid>
      <dc:creator>neilxu</dc:creator>
      <dc:date>2014-06-04T14:11:05Z</dc:date>
    </item>
    <item>
      <title>Re: glmselect with lasso options ends only 2 steps</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183749#M9549</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;This got me thinking a little bit. I used the example in the SAS/STAT 13.1 documentation, with changes. First, I ran:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc glmselect data=sashelp.Leutrain plots=coefficients;&lt;/P&gt;&lt;P&gt;model y = x1-x7129 /&lt;/P&gt;&lt;P&gt;selection=LASSO(choose=adjrsq);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This stopped after four steps. Then I ran:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc glmselect data=sashelp.Leutrain /*valdata=sashelp.Leutest*/&lt;/P&gt;&lt;P&gt;plots=coefficients;&lt;/P&gt;&lt;P&gt;model y = x1-x7129 /&lt;/P&gt;&lt;P&gt;selection=LASSO(choose=adjrsq steps=20);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This went out the full 20 steps, with the optimal value at step 20. So what happened when I did not include the steps= option? The adjusted Rsq criterion actually went down with the inclusion of the fifth predictor, so the procedure stopped there, with an adjusted Rsq of 0.6132. I think this is what is happening with your data. I can get all sorts of answers from this dataset, depending on the combination of options.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My personal preference might be to minimize PRESS, rather than maximizing adjusted Rsq or minimizing an information criterion, especially if I were trying to build a predictive model.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Steve Denham&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 04 Jun 2014 17:22:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/glmselect-with-lasso-options-ends-only-2-steps/m-p/183749#M9549</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2014-06-04T17:22:00Z</dc:date>
    </item>
  </channel>
</rss>

