I am trying to run multiple imputation with PROC MI, using fully conditional specification (FCS) logistic regression for some binary and ordinal variables (all variables are categorical except for a weight variable that is complete). The imputation seems to work and the results look sensible, but I keep getting the warning below for many of the variables. What does this warning mean? Any suggestions on how to remedy it would be greatly appreciated! (I am using SAS 9.4.)
WARNING: The maximum likelihood estimates for the logistic regression with observed observations may not exist for variable XYZ. The posterior predictive distribution of the parameters used in the imputation process is based on the maximum likelihood estimates in the last maximum likelihood iteration.
Thanks for all of your suggestions. The cause was indeed quasi-separation due to 0 cells. However, I managed to find some solutions without deleting the offending variable(s).
A couple of solutions:
1. This was complex sample survey data being imputed. The offending variables were a combined PSU/stratum variable and a location variable, both used as predictors in the imputation model (along with other variables). Multiple datasets from different geographic locations had been combined, so there were many 0 cells when looking at PSUstratum*location, because strata and PSUs were specific to each geographic region. The solution was to drop the location variable from the imputation model and use it only at the analysis stage, since the PSU/stratum variable already took location into account.
2. Despite fixing the PSU/stratum variable, I still got the MLE warning for a binary variable with a rare outcome. The solution was to switch from FCS logistic to FCS discrim and use /classeffects=include, so that the classification variables are still included as covariates in the imputation.
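For anyone landing here later, the second fix might look roughly like the sketch below. This is not the original poster's code: the data set `survey` and the variables `psu_stratum`, `sex`, `wt`, and `rare_y` are hypothetical names, and the data are simulated only so that the step can run on its own.

```sas
/* Hypothetical toy data: a rare binary outcome with about 20% missing values */
data survey;
   call streaminit(2024);
   do id = 1 to 500;
      psu_stratum = ceil(rand('uniform') * 4);   /* 4 design cells */
      sex         = rand('bernoulli', 0.5);
      wt          = 1 + rand('uniform');         /* complete weight variable */
      rare_y      = rand('bernoulli', 0.03);     /* rare outcome */
      if rand('uniform') < 0.2 then rare_y = .;  /* set some values to missing */
      output;
   end;
   drop id;
run;

/* Impute rare_y with the FCS discriminant function method instead of FCS logistic. */
/* CLASSEFFECTS=INCLUDE keeps the CLASS variables in the VAR list as covariates.    */
proc mi data=survey nimpute=5 seed=12345 out=mi_out;
   class rare_y psu_stratum sex;
   fcs discrim(rare_y / classeffects=include);
   var psu_stratum sex wt rare_y;
run;
```

Without CLASSEFFECTS=INCLUDE, the discriminant method uses only the continuous variables as covariates, which in data like these would leave just the weight.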
This generally happens because one of your input variables produces "perfect classification" of the outcome. If you remove that variable, the warning will go away. You can run PROC FREQ for each input variable and you will find the culprit easily.
I'm guessing that Mohamed was thinking about creating a two-way classification table and seeing if there are any zero cells. For example, if all the Y=0 are associated with Sex="Female", then there is a problem. The PROC FREQ code would look something like this:
/* With no DATA= option, PROC FREQ uses the most recently created data set */
PROC FREQ;
   tables Y*Sex;
run;
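If there are many predictors, reading every table by eye gets tedious. One possible variation, sketched here with a made-up data set `have`, is to write the cell counts to a data set with the SPARSE option (so that empty combinations are written out with a count of 0) and then print only the empty cells:

```sas
/* Hypothetical toy data: no Male rows have Y=0 */
data have;
   input y sex $;
   datalines;
0 Female
0 Female
1 Female
1 Male
1 Male
;
run;

/* SPARSE writes every combination of Y and Sex to OUT=, including empty ones */
proc freq data=have noprint;
   tables y*sex / sparse out=cells;
run;

/* List only the zero cells, which are candidates for (quasi-)separation */
proc print data=cells;
   where count = 0;
run;
```

Any row printed by the last step is a combination that never occurs in the data and is worth a closer look.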
My guess is that your variable XYZ is a categorical variable with a great many levels, which splits your data into very sparse cells, so the MLE may not exist.
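A quick way to check how many levels each categorical variable has is the NLEVELS option of PROC FREQ. The sketch below uses `sashelp.cars` purely as a stand-in; substitute your own data set and variables:

```sas
/* NLEVELS prints a table with the number of distinct levels per variable; */
/* NOPRINT on the TABLES statement suppresses the frequency tables.        */
proc freq data=sashelp.cars nlevels;
   tables make type origin / noprint;
run;
```

Variables with a very large number of levels relative to the sample size are the likeliest source of sparse cells.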