Hi, everyone.
I'm using the ASSESS statement in PROC PSMATCH to generate diagnostic tools for my propensity score matches. All of my categorical variables are binary, but SAS continues to insist that two of them are not and drops them from the ASSESS statement.
I've triple-checked these variables and they are, in fact, binary.
I will note that I created these as dummy variables using the PROC GLMSELECT procedure described here. However, none of the other 20+ dummies that were created with this procedure trigger this warning.
Any thoughts on why I'm getting this result?
Many thanks for your help.
(FYI: This is the same question as the one in this post, but the responses seem to assume that the variable generating the warning was not in fact binary. There isn't an answer (or I'm not seeing it) about what to do if the variable is, in fact, binary.)
You have three responses for resp_never_married (missing, 0, and 1), so the program assumes that the variable is not binary. Try removing the records with a missing response and see if that improves things.
SteveDenham
Hi, Steve. Thanks for the quick reply. I tried your suggestion but the problem persists. Here is a screenshot of the warning message I received after I dropped the missing observations for the marital variable.
I will also note that most of my other nonbinary categorical variables have missing values as well but they don't trigger the same problem. For example:
Perhaps this is another clue: Because I received the warning that "the maximum likelihood estimates for the logistic regression model might not exist" while running the PROC PSMATCH procedure, I generated my propensity scores using the PROC LOGISTIC procedure. I noticed that the output differed for the marital variables compared to the other nonbinary categorical variables:
It looks like SAS is not seeing the obs that are coded as "1" for the marital variables. Does this provide any insight into what is going on?
Kelly
I suspect the issue is due to missing values in the response or other covariates. PROC PSMATCH only uses the observations with non-missing values, and based on the PROC LOGISTIC output you provided I suspect that the 116 observations with resp_never_married=1 and the 41 observations with resp_sep_div_wid=1 all have a missing value for some other covariate and are ultimately excluded. In that case PROC PSMATCH detects only 1 level for those variables, not exactly 2, and therefore issues that warning and excludes them from the ASSESS statement output. You might try just using that 0/1 coding treating them as continuous inputs and not listing them in the CLASS statement.
EDIT: I realized my comment about treating those variables as continuous is a bit silly. If my assumption is correct, that means they are constant, so the values are the same in each treatment condition and they are trivially balanced.
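One way to check this is to mimic the complete-case filtering yourself and count the levels that remain. This is only a minimal sketch: the data set name `mydata`, the treatment variable `treat`, and the covariate `spouse_employed` are placeholders I made up, so substitute your own names and the full covariate list from your model.

```sas
/* Hypothetical names: mydata, treat, spouse_employed.            */
/* Replace the list with every covariate in your PSMODEL.         */
%let covars = resp_never_married resp_sep_div_wid spouse_employed;

/* Keep only observations with no missing treatment or covariate, */
/* which approximates the observations the model actually uses.   */
data complete_cases;
   set mydata;
   if cmiss(of treat &covars) = 0;
run;

/* NLEVELS reports the number of distinct levels per variable.    */
/* A dummy that shows one level here is constant in the analysis. */
proc freq data=complete_cases nlevels;
   tables resp_never_married resp_sep_div_wid / missing;
run;
```

If either dummy shows a single level in the complete cases, that would be consistent with the warning you are seeing.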
Hi Michael and Steve. Your focus on missing values helped me figure out the problem with the marital status dummies not being recognized as binary by SAS. One of the covariates I included in the logistic regression model used to generate the propensity scores was spouse/partner's employment. The inclusion of that variable meant that the values of respondent's marital status variable had to be, by definition, "married" (which was confirmed by the fact that "is your spouse employed" question was asked only if a person answered they were married/had a partner to the marital status question). This in turn caused the marital status dummies to lose their variability -- all obs showed a value of 1 for the married status dummies and a value of 0 for the other marital status dummies. I was able to resolve the problem by eliminating the spouse employed variable, thus retaining the variance in the marital status dummies.
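For anyone who runs into the same thing, a crosstab along these lines would have shown the problem right away. It is only a sketch: `mydata` and `spouse_employed` are stand-in names, not my actual data set or variable names.

```sas
/* Hypothetical names: mydata, spouse_employed.                    */
/* The MISSING option treats missing values as a level of their    */
/* own, so the table shows which marital-status groups have no     */
/* spouse-employment response at all.                               */
proc freq data=mydata;
   tables resp_never_married * spouse_employed
          resp_sep_div_wid   * spouse_employed
          / missing norow nocol nopercent;
run;
```

If every observation with a dummy value of 1 falls in the missing column for the spouse variable, those observations drop out once that variable is in the model, and the dummy ends up constant.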
Hope that makes sense. Thanks for your help!
Kelly