<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Difference between procedures: LOGISTIC or HPGENSELECT in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554229#M27563</link>
    <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;I am just starting to learn about advanced methods of variable selection (lasso, LAR, ridge, ...). As a start, I simply wanted to test the different functionalities of SAS and tried to implement a stepwise regression in PROC HPGENSELECT to compare it with PROC LOGISTIC, since both procedures offer stepwise selection. I know that for LASSO I have to use HPGENSELECT. But a few questions have already come up:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;1) I thought that these two steps would do the same thing:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc logistic data=TEST;
   	class y x1 x2 x3 x4 x5 x6 x7 x8;
	model y = x1 x2 x3 x4 x5 x6 x7 x8 / link=logit selection=stepwise
	                  slentry=0.2
	                  slstay=0.167
	                  details
	                  lackfit;
run;

proc hpgenselect data=TEST;
   	class y x1 x2 x3 x4 x5 x6 x7 x8;
	model y = x1 x2 x3 x4 x5 x6 x7 x8 / link=logit;
	selection method=stepwise(select=sl sle=0.2 sls=0.167 /*stop=SBC*/);
	performance details;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;PROC LOGISTIC bases its decisions on p-values, which is what&amp;nbsp;&lt;CODE class=" language-sas"&gt;select=sl&lt;/CODE&gt; requests in the second step (with the same entry and stay levels). But the results are different. Does the model or algorithm differ between these procedures, and if so, how?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;2) For an ordinal logistic regression with ordinal IVs, must each variable be followed by &lt;FONT face="courier new,courier"&gt;(param = ordinal)? &lt;/FONT&gt;E.g. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;class y x1(param = ordinal) x2(param = ordinal) x3(param = ordinal) x4(param = ordinal) x5(param = ordinal) x6(param = ordinal) x7(param = ordinal) x8(param = ordinal);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 26 Apr 2019 12:35:14 GMT</pubDate>
    <dc:creator>lotcarrots</dc:creator>
    <dc:date>2019-04-26T12:35:14Z</dc:date>
    <item>
      <title>Difference between procedures: LOGISTIC or HPGENSELECT</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554229#M27563</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;I am just starting to learn about advanced methods of variable selection (lasso, LAR, ridge, ...). As a start, I simply wanted to test the different functionalities of SAS and tried to implement a stepwise regression in PROC HPGENSELECT to compare it with PROC LOGISTIC, since both procedures offer stepwise selection. I know that for LASSO I have to use HPGENSELECT. But a few questions have already come up:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;1) I thought that these two steps would do the same thing:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc logistic data=TEST;
   	class y x1 x2 x3 x4 x5 x6 x7 x8;
	model y = x1 x2 x3 x4 x5 x6 x7 x8 / link=logit selection=stepwise
	                  slentry=0.2
	                  slstay=0.167
	                  details
	                  lackfit;
run;

proc hpgenselect data=TEST;
   	class y x1 x2 x3 x4 x5 x6 x7 x8;
	model y = x1 x2 x3 x4 x5 x6 x7 x8 / link=logit;
	selection method=stepwise(select=sl sle=0.2 sls=0.167 /*stop=SBC*/);
	performance details;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;PROC LOGISTIC bases its decisions on p-values, which is what&amp;nbsp;&lt;CODE class=" language-sas"&gt;select=sl&lt;/CODE&gt; requests in the second step (with the same entry and stay levels). But the results are different. Does the model or algorithm differ between these procedures, and if so, how?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;2) For an ordinal logistic regression with ordinal IVs, must each variable be followed by &lt;FONT face="courier new,courier"&gt;(param = ordinal)? &lt;/FONT&gt;E.g. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;class y x1(param = ordinal) x2(param = ordinal) x3(param = ordinal) x4(param = ordinal) x5(param = ordinal) x6(param = ordinal) x7(param = ordinal) x8(param = ordinal);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;Thanks in advance!&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 26 Apr 2019 12:35:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554229#M27563</guid>
      <dc:creator>lotcarrots</dc:creator>
      <dc:date>2019-04-26T12:35:14Z</dc:date>
    </item>
    <item>
      <title>Re: Difference between procedures: LOGISTIC or HPGENSELECT</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554271#M27564</link>
      <description>&lt;P&gt;1) Are you sure the MODELS are different, or is it just that the parameterization of the CLASS variables is different?&lt;/P&gt;
&lt;P&gt;The LOGISTIC procedure uses an EFFECT parameterization to build the design matrix.&lt;/P&gt;
&lt;P&gt;The HP procedures use the GLM parameterization as a default.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This will result in different parameter estimates. To use the same parameterization, change the&amp;nbsp;LOGISTIC procedure to use the GLM parameterization by using&lt;/P&gt;
&lt;P&gt;CLASS y x1 x2 x3 x4 x5 x6 x7 x8 / PARAM=GLM;&lt;/P&gt;
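&lt;P&gt;As a rough sketch, this is how that CLASS statement would sit in the full step. It assumes the same TEST data set, variables, and options as in the original post; the PARAM=GLM option is the only change.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: the PROC LOGISTIC step from the question, switched to GLM coding */
/* for the CLASS variables. TEST, y, and x1-x8 are taken from the question.      */
proc logistic data=TEST;
   class y x1 x2 x3 x4 x5 x6 x7 x8 / param=glm;
   model y = x1 x2 x3 x4 x5 x6 x7 x8 / link=logit selection=stepwise
                     slentry=0.2
                     slstay=0.167
                     details
                     lackfit;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With the same coding in both procedures, the parameter estimates for a given model should be directly comparable. Whether the stepwise selection then picks the same effects is a separate question.&lt;/P&gt;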
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2) No. The PARAM=ORDINAL option has nothing to do with ordinal variables. It is &lt;A href="https://go.documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_logistic_syntax05.htm&amp;amp;docsetVersion=15.1&amp;amp;locale=en" target="_self"&gt;the name of a parameterization&lt;/A&gt;&amp;nbsp;and determines how the design matrix is constructed. I suggest you stick with the more familiar parameterizations, which are easier to interpret.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Apr 2019 14:20:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554271#M27564</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2019-04-26T14:20:12Z</dc:date>
    </item>
    <item>
      <title>Re: Difference between procedures: LOGISTIC or HPGENSELECT</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554652#M27568</link>
      <description>&lt;P&gt;Oh, &lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN&gt;I just overlooked the default settings - &lt;/SPAN&gt;&lt;SPAN class=""&gt;silly me... Thanks a lot for your advice, also on the second question!&lt;BR /&gt;&lt;BR /&gt;However,&lt;/SPAN&gt;&lt;/SPAN&gt; for the first question, the two steps still result in different models:&lt;/P&gt;&lt;P&gt;(I) PROC LOGISTIC selects a model with 5 variables&lt;/P&gt;&lt;P&gt;(II) PROC HPGENSELECT yields an intercept-only model&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I found out that &lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;the 'Optimization Technique' causes this difference. &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;The Model Information table reports for (I) LOGISTIC:&lt;/P&gt;&lt;DIV class="branch"&gt;&lt;DIV&gt;&lt;DIV align="center"&gt;Optimization Technique &lt;TABLE cellspacing="0" cellpadding="5"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Fisher's scoring&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;and for (II) HPGENSELECT:&lt;/P&gt;&lt;DIV class="branch"&gt;&lt;DIV&gt;&lt;DIV align="center"&gt;Optimization Technique &lt;TABLE cellspacing="0" cellpadding="5"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Newton-Raphson with Ridging&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Adding &lt;FONT face="courier new,courier"&gt;technique=newton&lt;/FONT&gt; to the MODEL statement of PROC LOGISTIC &lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;also leads to an intercept-only model (now matching HPGENSELECT).&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;As I understand it, parameter estimates are not supposed to differ between the two techniques. 
&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;For generalized logit models, only the Newton-Raphson technique is available (&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;&lt;A href="https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_logistic_sect033.htm" target="_blank" rel="noopener"&gt;https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_logistic_sect033.htm&lt;/A&gt;). But apparently, the two techniques lead to different variable selections.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;For comparison: LASSO (Optimization Technique "Nesterov") also chooses the intercept-only model, as this one has the lowest SBC. Hmm...&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV class="text-wrap tlid-copy-target"&gt;&lt;DIV class="result-shield-container tlid-copy-target"&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class=""&gt;That would mean that none of my variables explains much. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 29 Apr 2019 11:40:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Difference-between-procedures-LOGISTIC-or-HPGENSELECT/m-p/554652#M27568</guid>
      <dc:creator>lotcarrots</dc:creator>
      <dc:date>2019-04-29T11:40:24Z</dc:date>
    </item>
  </channel>
</rss>

