<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Why is SAS providing a coefficient estimate when a variable predicts failure perfectly? in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473232#M71030</link>
    <description>&lt;P&gt;I'm running a model similar to the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc logistic data=table;
  model Y = X1 X2 X1*X2 X3 X4 X5;
run;&lt;/PRE&gt;&lt;P&gt;In this model, Y equals 0 or 1, X1 and X2 are&amp;nbsp;indicator variables (equal to 0 or 1), and X3, X4, and X5 are continuous. In this sample, Y = 0 for all observations where X1*X2 = 1. Thus, the coefficient for &lt;SPAN&gt;X1*X2&lt;/SPAN&gt; should not be estimable by maximum likelihood. However, SAS still provides a point estimate and a statistically significant p-value for&amp;nbsp;&lt;SPAN&gt;X1*X2&lt;/SPAN&gt; &lt;U&gt;without displaying any error or warning&lt;/U&gt; in the log, such as the one about separation of data points. As far as SAS is concerned, "convergence criterion (GCONV=1E-8) satisfied" and all is dandy in the world.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Why? What is going on? Surely SAS shouldn't be behaving this way? When I run this same model on the same sample in Stata, Stata appropriately drops &lt;SPAN&gt;X1*X2&lt;/SPAN&gt; when estimating the model.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any insights on this would be great.&lt;/P&gt;</description>
    <pubDate>Tue, 26 Jun 2018 05:10:24 GMT</pubDate>
    <dc:creator>BobSmith</dc:creator>
    <dc:date>2018-06-26T05:10:24Z</dc:date>
    <item>
      <title>Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473232#M71030</link>
      <description>&lt;P&gt;I'm running a model similar to the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc logistic data=table;
  model Y = X1 X2 X1*X2 X3 X4 X5;
run;&lt;/PRE&gt;&lt;P&gt;In this model, Y equals 0 or 1, X1 and X2 are&amp;nbsp;indicator variables (equal to 0 or 1), and X3, X4, and X5 are continuous. In this sample, Y = 0 for all observations where X1*X2 = 1. Thus, the coefficient for &lt;SPAN&gt;X1*X2&lt;/SPAN&gt; should not be estimable by maximum likelihood. However, SAS still provides a point estimate and a statistically significant p-value for&amp;nbsp;&lt;SPAN&gt;X1*X2&lt;/SPAN&gt; &lt;U&gt;without displaying any error or warning&lt;/U&gt; in the log, such as the one about separation of data points. As far as SAS is concerned, "convergence criterion (GCONV=1E-8) satisfied" and all is dandy in the world.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Why? What is going on? Surely SAS shouldn't be behaving this way? When I run this same model on the same sample in Stata, Stata appropriately drops &lt;SPAN&gt;X1*X2&lt;/SPAN&gt; when estimating the model.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any insights on this would be great.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jun 2018 05:10:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473232#M71030</guid>
      <dc:creator>BobSmith</dc:creator>
      <dc:date>2018-06-26T05:10:24Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473234#M71031</link>
      <description>&lt;P&gt;If X1 and X2 are binary variables, you should not treat them as regression variables. Put a Class Statement above your Model Statement like this&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;class X1 X2;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 26 Jun 2018 04:41:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473234#M71031</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2018-06-26T04:41:19Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473238#M71033</link>
      <description>&lt;P&gt;Looks to me like X2 is an excellent predictor for Y. Collinearity is a problem when it occurs between predictors, in which case it is sometimes better to drop one of the culprits. But one does expect some sort of relationship between the dependent variable and its predictors. Issuing a&amp;nbsp;note when that relationship is a little too&amp;nbsp;perfect might be a good idea, though.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jun 2018 05:02:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473238#M71033</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2018-06-26T05:02:16Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473239#M71034</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;P&gt;Looks to me like X2 is an excellent predictor for Y. Collinearity is a problem when it occurs between predictors, in which case it is sometimes better to drop one of the culprits. But one does expect some sort of relationship between the dependent variable and its predictors. Issuing a&amp;nbsp;note when that relationship is a little too&amp;nbsp;perfect might be a good idea, though.&lt;/P&gt;&lt;DIV class="UserSignature lia-message-signature"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="UserSignature lia-message-signature"&gt;PG&lt;/DIV&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I edited my original post to clarify the model. However, the original point still stands. Maximum likelihood should not yield a finite point estimate for a variable in a logistic model when Y does not vary among the observations where that variable equals 1. For example, see&amp;nbsp;&lt;A href="http://support.sas.com/rnd/app/stat/papers/logistic.pdf" target="_blank"&gt;http://support.sas.com/rnd/app/stat/papers/logistic.pdf&lt;/A&gt; or&amp;nbsp;&lt;A href="https://www.statalist.org/forums/forum/general-stata-discussion/general/1357105-stata-omits-variables-how-can-i-deal-with-it?p=1357119#post1357119" target="_blank"&gt;https://www.statalist.org/forums/forum/general-stata-discussion/general/1357105-stata-omits-variables-how-can-i-deal-with-it?p=1357119#post1357119&lt;/A&gt;&amp;nbsp;or page 5 of&amp;nbsp;&lt;A href="https://www.stata.com/manuals13/rlogit.pdf" target="_blank"&gt;https://www.stata.com/manuals13/rlogit.pdf&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would expect SAS to at least issue a warning or an error when this happens. It should not be providing a point estimate with p-values as if nothing is wrong. Does anyone know why SAS is behaving this way?&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jun 2018 05:38:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473239#M71034</guid>
      <dc:creator>BobSmith</dc:creator>
      <dc:date>2018-06-26T05:38:12Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473343#M71035</link>
      <description>&lt;P&gt;You haven't provided data, so there is not a lot we can say. Issues like this&amp;nbsp;usually require looking at the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can say that when I try to reproduce your claim by using a simulation, SAS reports the warning that you are expecting. Try running the code below. Do you see these warnings? If so, maybe your data are not what you believe them to be.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;SAS Log:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;WARNING: There is possibly a quasi-complete separation of data points.&lt;BR /&gt;The maximum likelihood estimate may not exist.&lt;BR /&gt;WARNING: The LOGISTIC procedure continues in spite of the above warning.&lt;BR /&gt;Results shown are based on the last maximum likelihood&lt;BR /&gt;iteration. Validity of the model fit is questionable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;SAS Output:&lt;/STRONG&gt;&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Logistic: Convergence Status" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="c b header" scope="col"&gt;Model Convergence Status&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="c data"&gt;Quasi-complete separation of data points detected.&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="warncontent"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="c warnbanner"&gt;Warning:&lt;/TD&gt;
&lt;TD class="l warncontent"&gt;The maximum likelihood estimate may not exist.&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Have;
call streaminit(1234);
do i = 1 to 200;
   x1 = rand("Bernoulli", 0.7);
   x2 = rand("Bernoulli", 0.5);
   x3 = rand("Normal", 2, 3);
   x4 = rand("Normal", 0, 1);
   x5 = rand("Normal", -1, 2);
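   eta = x1 - x2 + 0.5*x1*x2 + x3 - 2*x4 + 3*x5;   /* linear predictor */
   if x1*x2=1 then 
      Y = 1;   /* constant response in the x1=1, x2=1 cell */
   else
      Y = rand("Bernoulli", logistic(eta));   /* otherwise draw Y from the logistic model */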
   output;
end;
run;
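
/* Added sketch, not part of the original reply: a possible pre-check
   before fitting. It lists every observed combination of x1, x2, and Y
   in the simulated data set Have. If the x1=1, x2=1 combination appears
   with only one value of Y, that cell predicts the response perfectly
   and a separation warning is to be expected from the fit that follows. */
proc freq data=Have;
   tables x1*x2*Y / list;
run;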

proc logistic data=Have;
 class x1 x2;
 model Y(event='1') = X1 X2 X1*X2 X3 X4 X5;  /* quasi-separation */
 *model Y = X1 X2 X3 X4 X5;  /* model OK */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 26 Jun 2018 12:30:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473343#M71035</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-06-26T12:30:53Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473499#M71045</link>
      <description>&lt;P&gt;I can't provide the data on a public forum. However, I know that a warning message is usually displayed. I've seen warnings about complete or quasi-complete separation of data points before. (I get the quasi-complete separation warning when running your code.) In my case, however, no warning is being displayed. I assure you my data are as described. Plus, Stata behaves exactly as expected&amp;nbsp;by dropping the variable, so...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Maybe I could privately share the dataset with someone at SAS who can diagnose the problem? This may be a rare edge case. SAS has been known to provide misleading coefficients before without appropriate warning messages (&lt;A href="https://pdfs.semanticscholar.org/4f17/1322108dff719da6aa0d354d5f73c9c474de.pdf" target="_blank"&gt;https://pdfs.semanticscholar.org/4f17/1322108dff719da6aa0d354d5f73c9c474de.pdf&lt;/A&gt;).&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jun 2018 17:51:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473499#M71045</guid>
      <dc:creator>BobSmith</dc:creator>
      <dc:date>2018-06-26T17:51:42Z</dc:date>
    </item>
    <item>
      <title>Re: Why is SAS providing a coefficient estimate when a variable predicts failure perfectly?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473526#M71047</link>
      <description>&lt;P&gt;&lt;A href="https://support.sas.com/en/technical-support/contact-sas.html" target="_self"&gt;SAS Technical Support &lt;/A&gt;is always happy to help.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jun 2018 19:21:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Why-is-SAS-providing-a-coefficient-estimate-when-a-variable/m-p/473526#M71047</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-06-26T19:21:23Z</dc:date>
    </item>
  </channel>
</rss>

