<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: HPGENSELECT - interpretation of LASSO coefficients in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763260#M37263</link>
    <description>&lt;P&gt;Agreeing with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15363"&gt;@SteveDenham&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The different parameterizations are the same model. Interpreting the coefficients is the part that trips people up, but LSMEANS eliminates all of that confusion. I wrote a post about this issue (although in a simpler example).&amp;nbsp;&lt;A href="https://communities.sas.com/t5/Statistical-Procedures/Interpreting-Multivariate-Linear-Regression-with-Categorical/m-p/591230#M28913" target="_blank"&gt;https://communities.sas.com/t5/Statistical-Procedures/Interpreting-Multivariate-Linear-Regression-with-Categorical/m-p/591230#M28913&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 23 Aug 2021 14:46:28 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2021-08-23T14:46:28Z</dc:date>
    <item>
      <title>HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/762313#M37221</link>
      <description>&lt;P&gt;Hello fellow SAS users and SAS support,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have been using &lt;STRONG&gt;HPGENSELECT with LASSO selection&lt;/STRONG&gt; for a &lt;STRONG&gt;binary dependent variable&lt;/STRONG&gt;, and was hoping for clarification regarding the details of the LASSO penalization method and the resulting coefficients. I will post my SAS code at the end. My two questions are:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;When HPGENSELECT has been called with the LASSO option and there are CLASS variables present, does it perform &lt;STRONG&gt;group LASSO&lt;/STRONG&gt; optimization, in which the categories of a class variable are either all selected or all set to zero? This is in contrast to regular LASSO, in which some categories might have a non-zero coefficient but others do not; the fact that they belong to a single effect is ignored.&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;LI&gt;When I use the PARAM = GLM option in the CLASS statement, I seem to invoke less-than-full-rank parameterization of the categorical variables. This means that each level of a class variable gets a dummy variable and all dummy variables are entered into the model. Such a model is not estimable by OLS or maximum likelihood, so a reference category is normally forced, but LASSO can handle overparameterized models. My question is, how does one then &lt;STRONG&gt;interpret the coefficients&lt;/STRONG&gt;? Is it done by calculating the contrasts manually?&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;For example, take my screenshot below of the parameter estimates on the log-odds scale. The variable "Location" has only 4 levels in the data, all of which are present in the fitted model. If one were interested in, say, comparing Locations 2 through 4 to Location 1 as a reference category, would you calculate the difference in estimates on the log-odds scale (e.g. 0.026 versus 0.074) and then exponentiate to obtain familiar odds ratios?&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="HPGENSELECT LASSO.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/62700i2358A395AD3595A5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="HPGENSELECT LASSO.png" alt="HPGENSELECT LASSO.png" /&gt;&lt;/span&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;Thanks very much for any insight you can provide!&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;SAS code below, if it helps. Note that this is from SAS version 9.4,&amp;nbsp;SAS/STAT 15.1.&lt;/LI&gt;&lt;/OL&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC HPGENSELECT data=my_data LASSORHO=.80 LASSOSTEPS=20;
WHERE  location NOTIN (5,6);
CLASS  gender location Physiologic_difficult_AW &amp;lt;many more predictors&amp;gt;
         / param=GLM;
MODEL  Number_attempts = 
       gender location Physiologic_difficult_AW &amp;lt;many more predictors&amp;gt; / DISTRIBUTION=BINARY ;
SELECTION METHOD=LASSO(CHOOSE=AIC) DETAILS=ALL;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 18 Aug 2021 15:23:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/762313#M37221</guid>
      <dc:creator>dufaultb</dc:creator>
      <dc:date>2021-08-18T15:23:59Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763011#M37246</link>
      <description>&lt;P&gt;Please check this paper on HPGENSELECT: &lt;A href="https://support.sas.com/resources/papers/proceedings15/SAS1742-2015.pdf" target="_blank"&gt;https://support.sas.com/resources/papers/proceedings15/SAS1742-2015.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Aug 2021 06:56:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763011#M37246</guid>
      <dc:creator>gcjfernandez</dc:creator>
      <dc:date>2021-08-21T06:56:46Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763252#M37261</link>
      <description>Thank you. This paper suggests that Group LASSO is invoked by HPGENSELECT, which answers my first question. However, the example shown there uses PARAM=REF, so it does not address the second question.</description>
      <pubDate>Mon, 23 Aug 2021 14:30:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763252#M37261</guid>
      <dc:creator>dufaultb</dc:creator>
      <dc:date>2021-08-23T14:30:08Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763255#M37262</link>
      <description>&lt;P&gt;The answer to the second question is "Yes", but there might be better ways of comparing.&amp;nbsp; Once a model is selected, you could fit using GENMOD and use the LSMEANS statement with the ODDSRATIO option.&lt;/P&gt;
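&lt;P&gt;A minimal sketch of the manual calculation, assuming (hypothetically) that 0.026 and 0.074 are the Location 1 and Location 2 estimates on the log-odds scale. The data set and variable names here are made up for illustration:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical: which level each estimate belongs to is an assumption */
data or_by_hand;
   b_loc1 = 0.026;               /* assumed estimate for Location 1 */
   b_loc2 = 0.074;               /* assumed estimate for Location 2 */
   log_or = b_loc2 - b_loc1;     /* difference on the log-odds scale */
   odds_ratio = exp(log_or);     /* Location 2 versus Location 1 */
run;

proc print data=or_by_hand noobs;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Both inputs are shrunken estimates, and the hand calculation by itself gives no standard error or confidence interval for the ratio.&lt;/P&gt;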
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
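&lt;P&gt;A rough sketch of the GENMOD refit, assuming for illustration that gender and location are among the selected effects and reusing the data set and variable names from the original post. This refits by unpenalized maximum likelihood, so the estimates will not match the shrunken LASSO coefficients:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc genmod data=my_data;
   where location notin (5,6);
   /* Assumption: only these two effects survived selection */
   class gender location / param=glm;   /* LSMEANS needs GLM parameterization */
   /* Check the response profile in the output to see which level is modeled */
   model Number_attempts = gender location / dist=binomial link=logit;
   /* Pairwise differences of the location LS-means, reported as odds ratios with limits */
   lsmeans location / diff oddsratio cl;
run;&lt;/CODE&gt;&lt;/PRE&gt;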
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Mon, 23 Aug 2021 14:37:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763255#M37262</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2021-08-23T14:37:55Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763260#M37263</link>
      <description>&lt;P&gt;Agreeing with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15363"&gt;@SteveDenham&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The different parameterizations are the same model. Interpreting the coefficients is the part that trips people up, but LSMEANS eliminates all of that confusion. I wrote a post about this issue (although in a simpler example).&amp;nbsp;&lt;A href="https://communities.sas.com/t5/Statistical-Procedures/Interpreting-Multivariate-Linear-Regression-with-Categorical/m-p/591230#M28913" target="_blank"&gt;https://communities.sas.com/t5/Statistical-Procedures/Interpreting-Multivariate-Linear-Regression-with-Categorical/m-p/591230#M28913&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Aug 2021 14:46:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763260#M37263</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-08-23T14:46:28Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763270#M37264</link>
      <description>&lt;P&gt;Thanks very much for your helpful reply. I think LSMEANS is a lovely tool and certainly would be useful here.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Just one tangential comment regarding:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;P&gt;The different parameterizations are the same model.&lt;/P&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;This is generally true; fit statistics are invariant to parameterization for OLS and ML models. However, with LASSO,&amp;nbsp;&lt;EM&gt;the choice of parameterization can affect variable selection and shrinkage estimates!&amp;nbsp;&lt;/EM&gt;&amp;nbsp;In a way this makes sense. If we choose a reference category that lies in the middle of the others with respect to the outcome, the contrasting coefficients will be small and could get "shrunk away" to zero during optimization. Group LASSO is less vulnerable to this, because all levels of an effect are kept or dropped together.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Aug 2021 15:22:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763270#M37264</guid>
      <dc:creator>dufaultb</dc:creator>
      <dc:date>2021-08-23T15:22:53Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763273#M37265</link>
      <description>The "yes" confirmation is quite helpful, thank you very much.&lt;BR /&gt;&lt;BR /&gt;I might be reluctant to use a secondary GLM procedure to calculate the contrasts since the regression weights will be re-estimated without shrinkage, whereas the shrunk estimates might be more reliable from a cross-validation / reproducibility point of view. But this is an ongoing conversation in the literature, to my knowledge.&lt;BR /&gt;&lt;BR /&gt;Thanks again.</description>
      <pubDate>Mon, 23 Aug 2021 15:39:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763273#M37265</guid>
      <dc:creator>dufaultb</dc:creator>
      <dc:date>2021-08-23T15:39:41Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763293#M37271</link>
      <description>&lt;P&gt;Is your dataset so large that you have to use HPGENSELECT, rather than GLMSELECT?&amp;nbsp; Because if you can use the latter to do the LASSO selection, you have access to the STORE statement, from which you can use PLM to get least squares means and odds ratios.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;SteveDenham&lt;/P&gt;</description>
      <pubDate>Mon, 23 Aug 2021 16:47:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763293#M37271</guid>
      <dc:creator>SteveDenham</dc:creator>
      <dc:date>2021-08-23T16:47:21Z</dc:date>
    </item>
    <item>
      <title>Re: HPGENSELECT - interpretation of LASSO coefficients</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763884#M37280</link>
      <description>Great idea - will proceed as you suggest</description>
      <pubDate>Wed, 25 Aug 2021 15:47:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/HPGENSELECT-interpretation-of-LASSO-coefficients/m-p/763884#M37280</guid>
      <dc:creator>dufaultb</dc:creator>
      <dc:date>2021-08-25T15:47:43Z</dc:date>
    </item>
  </channel>
</rss>

