<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Strange with quasi-separation of data points in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/234775#M12402</link>
    <description>&lt;P&gt;Hi Thomas,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, there is a minor discrepancy between the attached data and the&amp;nbsp;frequency counts you provide: only after deleting observations 14, 15 and 16 (which look a bit misplaced) do my frequency counts match yours.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But this doesn't change the fact that there is a quasi-complete separation of data points in each of the four cases (models "Y=X1" and "Y=X2", with or without the above data correction): the minimum value of X in the subset of data points (X, Y) with Y=0 equals the maximum&amp;nbsp;&lt;SPAN&gt;value of X in the subset of data points (X, Y) with Y=1. Both are zero.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The difference between the two scenarios (X1 vs. X2) is just that for X2 the iterative process used to compute the maximum likelihood estimates appears to converge: The convergence criterion is met -- in spite of the quasi-complete separation.&amp;nbsp;This is documented in the output where it says (under "Model Convergence Status"): "Convergence criterion (GCONV=1E-8) satisfied."&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;That 1E-8 is the default setting of the GCONV= option. If you tighten the convergence criterion only a little (to GCONV=0.92E-8 or less in this example), it will no longer be met and you'll get the familiar warning about quasi-complete separation for X2 as well:&lt;/SPAN&gt;&lt;/P&gt;
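&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* DESC (DESCENDING) makes PROC LOGISTIC model the probability of Y=1 */
proc logistic data=test desc;
/* GCONV= set slightly below its default of 1E-8 */
model y=x2 / gconv=0.92e-8;
run;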
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;With or without that warning, the "telltale signs of quasi-complete separation" (Paul D. Allison: Logistic Regression using SAS. SAS Institute Inc. 1999, p. 44), namely a large (absolute) estimate, a large standard error and a large p-value, are present anyway and indicate that the affected independent variable may be problematic.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 14 Nov 2015 19:44:38 GMT</pubDate>
    <dc:creator>FreelanceReinh</dc:creator>
    <dc:date>2015-11-14T19:44:38Z</dc:date>
    <item>
      <title>Strange with quasi-separation of data points</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/234600#M12395</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I have a strange issue with quasi-separation of data points in PROC LOGISTIC.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In my dataset I have an outcome variable Y and a predictor X. When I run the model Y=X, SAS tells me "Quasi-complete separation of data points detected". I am not surprised, since the pattern in the dataset looks like this:&lt;/P&gt;
&lt;P&gt;Y=0 X=1 (n=13)&lt;/P&gt;
&lt;P&gt;Y=0 X=0 (n=288)&lt;/P&gt;
&lt;P&gt;Y=1 X=0 (n=106)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now to the issue:&lt;/P&gt;
&lt;P&gt;If I change one value in the dataset so that it looks like this (pattern unchanged):&lt;/P&gt;
&lt;P&gt;Y=0 X=1&lt;STRONG&gt; (n=12)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Y=0 X=0&lt;STRONG&gt; (n=289)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Y=1 X=0 (n=106)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now SAS no longer gives me any warning about quasi-separation.&lt;/P&gt;
&lt;P&gt;Does anyone have any idea why this is? I think I still have quasi-separation.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I attach a txt file with 3 variables: Y, X1 (before changing the value) and X2 (after changing the value).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thomas&lt;/P&gt;</description>
      <pubDate>Fri, 13 Nov 2015 13:31:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/234600#M12395</guid>
      <dc:creator>bollibompa</dc:creator>
      <dc:date>2015-11-13T13:31:49Z</dc:date>
    </item>
    <item>
      <title>Re: Strange with quasi-separation of data points</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/234775#M12402</link>
      <description>&lt;P&gt;Hi Thomas,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, there is a minor discrepancy between the attached data and the&amp;nbsp;frequency counts you provide: only after deleting observations 14, 15 and 16 (which look a bit misplaced) do my frequency counts match yours.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But this doesn't change the fact that there is a quasi-complete separation of data points in each of the four cases (models "Y=X1" and "Y=X2", with or without the above data correction): the minimum value of X in the subset of data points (X, Y) with Y=0 equals the maximum&amp;nbsp;&lt;SPAN&gt;value of X in the subset of data points (X, Y) with Y=1. Both are zero.&lt;/SPAN&gt;&lt;/P&gt;
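&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A quick way to check this condition is to compare the range of X within each level of Y. The following is only a sketch: it assumes the attached file has been read into a dataset TEST (the name used in the PROC LOGISTIC step in this post) containing the variables Y, X1 and X2.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumption: WORK.TEST holds the attached data with variables Y, X1, X2 */
/* Prints the minimum and maximum of X1 and X2 separately for Y=0 and Y=1 */
proc means data=test min max;
class y;
var x1 x2;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the minimum for Y=0 equals the maximum for Y=1 and that boundary value occurs in both groups, the data are quasi-completely separated, no matter whether PROC LOGISTIC issues the warning.&lt;/P&gt;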
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The difference between the two scenarios (X1 vs. X2) is just that for X2 the iterative process used to compute the maximum likelihood estimates appears to converge: The convergence criterion is met -- in spite of the quasi-complete separation.&amp;nbsp;This is documented in the output where it says (under "Model Convergence Status"): "Convergence criterion (GCONV=1E-8) satisfied."&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;That 1E-8 is the default setting of the GCONV= option. If you tighten the convergence criterion only a little (to GCONV=0.92E-8 or less in this example), it will no longer be met and you'll get the familiar warning about quasi-complete separation for X2 as well:&lt;/SPAN&gt;&lt;/P&gt;
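&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* DESC (DESCENDING) makes PROC LOGISTIC model the probability of Y=1 */
proc logistic data=test desc;
/* GCONV= set slightly below its default of 1E-8 */
model y=x2 / gconv=0.92e-8;
run;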
&lt;/CODE&gt;&lt;/PRE&gt;
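&lt;P&gt;&lt;SPAN&gt;For readers without the attachment, the X2 scenario can presumably be explored from the frequency counts given in the original question. This is a minimal sketch under that assumption: the dataset name COUNTS and the count variable N are made up for illustration, and it has not been checked that the iteration history matches the run on the attached file exactly.&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical dataset COUNTS: one row per (Y, X2) cell, N = cell frequency */
data counts;
input y x2 n;
datalines;
0 1 12
0 0 289
1 0 106
;
run;

/* Default convergence criterion (GCONV=1E-8) */
proc logistic data=counts desc;
freq n;
model y=x2;
run;

/* Slightly tightened convergence criterion */
proc logistic data=counts desc;
freq n;
model y=x2 / gconv=0.92e-8;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;Comparing the "Model Convergence Status" sections of the two runs shows whether the warning depends on GCONV= for these counts as well.&lt;/SPAN&gt;&lt;/P&gt;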
&lt;P&gt;&lt;SPAN&gt;With or without that warning, the "telltale signs of quasi-complete separation" (Paul D. Allison: Logistic Regression using SAS. SAS Institute Inc. 1999, p. 44), namely a large (absolute) estimate, a large standard error and a large p-value, are present anyway and indicate that the affected independent variable may be problematic.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 14 Nov 2015 19:44:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/234775#M12402</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2015-11-14T19:44:38Z</dc:date>
    </item>
    <item>
      <title>Re: Strange with quasi-separation of data points</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/235433#M12457</link>
      <description>Many thanks for your detailed description! It helped me a lot!&lt;BR /&gt;&lt;BR /&gt;/Thomas</description>
      <pubDate>Thu, 19 Nov 2015 09:30:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Strange-with-quasi-separation-of-data-points/m-p/235433#M12457</guid>
      <dc:creator>bollibompa</dc:creator>
      <dc:date>2015-11-19T09:30:14Z</dc:date>
    </item>
  </channel>
</rss>

