wcw2
Obsidian | Level 7

I'm running a model in PROC LOGISTIC for the probability of a negative culture (Y/N) with two dichotomous predictors: drug (Y/N) and disease severity (Y/N). I also need to include study site because it's a confounder; there are 34 sites, and many are sparsely populated. However, when I add site, the model falls apart ("Quasi-complete separation of data points detected...WARNING: The maximum likelihood estimate may not exist....WARNING: The validity of the model fit is questionable."), I guess because there are so many sites. How do I approach this problem? Should I group the sites into several chunks? I don't often run multivariable models. Thank you.

3 REPLIES
PaigeMiller
Diamond | Level 26

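Generally, very sparse predictor variables are indeed a problem: when a level has only a few observations, the outcome in that level can be all Y or all N, which is what triggers the separation warning. You could group the sites, if there is a meaningful way to do such a grouping. Or you could try to find some continuous variable that might represent the sites.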

--
Paige Miller
wcw2
Obsidian | Level 7

OK, thanks. Yes, my plan is to just group them. Most of the population is at African sites, so I will try Africa/non-Africa groups.

StatDave
SAS Super FREQ

You could fit a conditional logistic model by stratifying on the sites by using the STRATA statement. Doing this will remove the need to estimate the separate parameters for the sites. See the conditional logistic example in the PROC LOGISTIC documentation. If you need to estimate the site parameters, you could try using the penalized likelihood method by adding the FIRTH option. Another possibility is exact estimation, but this is very resource intensive and might not be feasible.
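
As a rough sketch of the three options above, the code might look something like the following. The data set name (cultures) and variable names (neg_culture, drug, severe, site) are hypothetical and not from the original post; the Y/N coding is assumed from the question. Treat it as a starting point under those assumptions, not a validated analysis.

```sas
/* Hypothetical names: data set CULTURES with character Y/N variables
   NEG_CULTURE, DRUG, SEVERE and a site identifier SITE. */

/* Option 1: conditional logistic model. Site is conditioned out through
   the STRATA statement, so no site parameters are estimated. */
proc logistic data=cultures;
   class drug(ref='N') severe(ref='N') / param=ref;
   model neg_culture(event='Y') = drug severe;
   strata site;
run;

/* Option 2: Firth penalized likelihood. Site stays in the model as a
   CLASS effect, so site parameters are estimated. CLODDS=PL requests
   profile-likelihood confidence limits for the odds ratios. */
proc logistic data=cultures;
   class drug(ref='N') severe(ref='N') site / param=ref;
   model neg_culture(event='Y') = drug severe site / firth clodds=pl;
run;

/* Option 3: exact conditional analysis, stratified by site.
   EXACTONLY skips the usual unconditional fit. This can be very slow
   or infeasible, as noted above. */
proc logistic data=cultures exactonly;
   class drug(ref='N') severe(ref='N') / param=ref;
   model neg_culture(event='Y') = drug severe;
   strata site;
   exact drug severe / estimate=both;
run;
```

Note that with the STRATA approach, sites where every subject has the same outcome contribute nothing to the conditional likelihood, so it is worth checking how many sites are actually informative.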

