I am trying to run multiple imputation with PROC MI, using fully conditional specification (FCS) logistic regression for some binary and ordinal variables (all variables are categorical except for a weight variable, which is complete). Although the MI seems to work and I get numbers that make sense, I keep getting the warning message below for many of the variables. What does this warning mean? Any suggestions on how to remedy it would be greatly appreciated! (I am using SAS 9.4.)
WARNING: The maximum likelihood estimates for the logistic regression with observed observations may not exist for variable XYZ. The posterior predictive distribution of the parameters used in the imputation process is based on the maximum likelihood estimates in the last maximum likelihood iteration.
Accepted Solutions
Thanks for all of your suggestions. The cause was indeed quasi-separation due to 0 cells. However, I managed to find some solutions without deleting the offending variable(s).
A couple of solutions:
1. This was complex sample survey data being imputed. The offending variables were a combined PSU/stratum variable and a location variable, both used to impute data (along with other variables in the model). Multiple datasets from different geographic locations were combined, so there were many 0 cells when looking at PSUstratum*location, because strata and PSUs were specific to each geographic region. The solution was to drop the location variable from the imputation and use it only at the analysis stage, since the PSU/stratum variable already took location into account.
2. Despite fixing the PSU/stratum variable, I still got the MLE warning for a binary variable with a rare outcome. The solution to this was to switch from FCS logistic to FCS discrim and use /classeffects=include, so that the classification variables are included in the imputation.
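For readers who want to see roughly what these two fixes could look like in code, here is a minimal sketch. It is not the original poster's program: the data set name survey, the variables wt, psu_stratum, x1, and rare_y, and the NIMPUTE= and SEED= values are all placeholders assumed for illustration.

```sas
/* Sketch only: every data set and variable name here is hypothetical. */
proc mi data=survey nimpute=5 seed=12345 out=survey_mi;
  /* Fix 1: the location variable is simply left out of the imputation model. */
  class psu_stratum x1 rare_y;

  /* Fix 2: impute the rare binary variable with the discriminant function
     method instead of logistic regression. CLASSEFFECTS=INCLUDE lets the
     CLASS variables act as covariates; without it only continuous
     variables are used. x1 is still imputed with logistic regression. */
  fcs discrim(rare_y / classeffects=include)
      logistic(x1);

  /* wt is the complete, continuous weight variable. */
  var wt psu_stratum x1 rare_y;
run;
```

The location variable would then be brought back in at the analysis stage, as described in point 1.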
This generally happens because you have an input variable that produces "perfect classification". If you remove this variable, you will get rid of this warning. You can run PROC FREQ for each input variable and you will find this variable easily.
That makes sense, but what should I be looking for in PROC FREQ? I don’t have any variables with a row count of 0 and all of the categories in variables are pretty large with >1000 for most.
I'm guessing that Mohamed was thinking about creating a two-way classification table and seeing if there are any zero cells. For example, if all the Y=0 are associated with Sex="Female", then there is a problem. The PROC FREQ code would look something like this:
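PROC FREQ;  /* no DATA= option, so the most recently created data set is used */
  tables Y*Sex;  /* look for cells with a count of 0 */
run;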
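With many levels, zero cells are easy to miss in printed output. One possible approach, sketched below, is to write the cell counts to a data set and list only the empty cells. The names mydata, Y, PSUstratum, and location are placeholders; substitute the variables in your own imputation model.

```sas
/* Sketch only: mydata, Y, PSUstratum, and location are placeholder names. */
proc freq data=mydata noprint;
  /* Keep only rows where Y is observed, since those are the rows used
     to fit the imputation model for Y. This filter applies to both tables. */
  where not missing(Y);

  /* SPARSE writes every combination of levels to the OUT= data set,
     including combinations that never occur (COUNT = 0). */
  tables Y*PSUstratum        / sparse out=cells_y;
  tables PSUstratum*location / sparse out=cells_pred;
run;

/* List only the empty cells. */
proc print data=cells_y(where=(count=0));
run;

proc print data=cells_pred(where=(count=0));
run;
```

Any rows printed here are candidate sources of complete or quasi-complete separation.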
My guess is that your variable XYZ is a categorical variable with a great many levels, which splits your data into very sparse cells; that could cause the MLE not to exist.
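A quick way to check for this is to count the levels of each variable. The sketch below assumes a placeholder data set name, mydata.

```sas
/* Sketch only: mydata is a placeholder data set name. */
proc freq data=mydata nlevels;
  /* NLEVELS prints a table with the number of levels per variable;
     NOPRINT on the TABLES statement suppresses the frequency tables. */
  tables _all_ / noprint;
run;
```

Note that _all_ also picks up continuous variables such as a weight, which will naturally show a large number of levels and can be ignored.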