<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Logistic regression question in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869768#M43036</link>
    <description>&lt;P&gt;Generally very sparse predictor variables are indeed a problem. You could group the sites, if there is a meaningful way to do such a grouping. Or you could try to find some continuous variable that might represent the sites.&lt;/P&gt;</description>
    <pubDate>Fri, 14 Apr 2023 12:06:20 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2023-04-14T12:06:20Z</dc:date>
    <item>
      <title>Logistic regression question</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869764#M43035</link>
      <description>&lt;P&gt;I'm running a model in PROC LOGISTIC, modeling the probability of a negative culture (Y/N) with the dichotomous predictors drug (Y/N) and disease severity (Y/N). I also need to include study site (there are 34, and many are sparsely populated) because it's a confounder. However, when I do, the model falls apart ("Quasi-complete separation of data points detected...WARNING: The maximum likelihood estimate may not exist....WARNING: The validity of the model fit is questionable."), I guess because there are so many sites. How do I approach this problem? Should I group the sites into several chunks? I don't often run multivariable models. Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Apr 2023 11:56:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869764#M43035</guid>
      <dc:creator>wcw2</dc:creator>
      <dc:date>2023-04-14T11:56:37Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic regression question</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869768#M43036</link>
      <description>&lt;P&gt;Generally very sparse predictor variables are indeed a problem. You could group the sites, if there is a meaningful way to do such a grouping. Or you could try to find some continuous variable that might represent the sites.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Apr 2023 12:06:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869768#M43036</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-04-14T12:06:20Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic regression question</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869771#M43037</link>
      <description>&lt;P&gt;OK, thanks. Yes, my plan is to just group them. Most of the population comes from African sites, so I will try Africa/non-Africa groups.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Apr 2023 12:16:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869771#M43037</guid>
      <dc:creator>wcw2</dc:creator>
      <dc:date>2023-04-14T12:16:43Z</dc:date>
    </item>
    <item>
      <title>Re: Logistic regression question</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869789#M43038</link>
      <description>&lt;P&gt;You could fit a conditional logistic model by stratifying on the sites with the STRATA statement. Doing this removes the need to estimate separate parameters for the sites. See the conditional logistic example in the PROC LOGISTIC documentation. If you need to estimate the site parameters, you could try the penalized likelihood method by adding the FIRTH option to the MODEL statement. Another possibility is exact estimation, but this is very resource-intensive and might not be feasible.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Apr 2023 13:32:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Logistic-regression-question/m-p/869789#M43038</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-04-14T13:32:57Z</dc:date>
    </item>
  </channel>
</rss>

