☑ This topic is solved.
Kelly_K
Fluorite | Level 6

Hi, everyone.

 

I'm using the ASSESS statement in PROC PSMATCH to generate diagnostic tools for my propensity score matches.  All of my categorical variables are binary, but SAS continues to insist that two of them are not and drops them from the ASSESS statement. 

 

Screen Shot 2022-06-21 at 9.03.18 PM.png

I've triple-checked these variables and they are, in fact, binary.

 

Screen Shot 2022-06-21 at 9.03.57 PM.png

I will note that I created these as dummy variables using the PROC GLMSELECT approach described here.  However, none of the other 20+ dummies created the same way trigger this warning.

 

Any thoughts on why I'm getting this result?

 

Many thanks for your help.

 

(FYI:  This is the same question as the one in this post, but the responses seem to assume that the variable generating the warning was not in fact binary.  There isn't an answer (or I'm not seeing it) about what to do if the variable is, in fact, binary.)

 

 

ACCEPTED SOLUTION
MichaelL_SAS
SAS Employee

I suspect the issue is due to missing values in the response or other covariates. PROC PSMATCH only uses the observations with non-missing values, and based on the PROC LOGISTIC output you provided I suspect that the 116 observations with resp_never_married=1 and the 41 observations with resp_sep_div_wid=1 all have a missing value for some other covariate and are ultimately excluded. In that case PROC PSMATCH detects only 1 level for those variables, not exactly 2, and therefore issues that warning and excludes them from the ASSESS statement output. You might try just using that 0/1 coding treating them as continuous inputs and not listing them in the CLASS statement. 

 

EDIT: I realized my comment about treating those variables as continuous is a bit silly. If my assumption is correct, that means they are constant, so the values are the same in each treatment condition and they are trivially balanced. 
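
One way to check this suspicion is to rebuild the complete-case sample by hand and tabulate the two dummies within it. The following is a minimal sketch, not tested against the original data: the data set name `work.analysis`, the treatment variable `treat`, and the covariates `age` and `income` are placeholders. Only `resp_never_married` and `resp_sep_div_wid` come from this thread.

```sas
/* Placeholder names (assumptions): work.analysis, treat, age, income.   */
/* List the treatment variable and every covariate used in the model.    */
%let model_vars = treat age income resp_never_married resp_sep_div_wid;

data complete_cases;
   set work.analysis;
   /* CMISS counts missing values across numeric and character variables; */
   /* keep only rows where none of the model variables is missing.        */
   if cmiss(of &model_vars) = 0;
run;

/* Tabulate the dummies among the rows that survive the missing-value filter */
proc freq data=complete_cases;
   tables resp_never_married resp_sep_div_wid / missing;
run;
```

If either table shows a single level, the variable is constant among the observations with no missing model variables, which would be consistent with the warning.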


4 REPLIES
SteveDenham
Jade | Level 19

You have three responses for resp_never_married - missing, 0 and 1, so the program assumes that the variable is not binary.  Try removing the missing response records and see if that improves things.

 

SteveDenham

Kelly_K
Fluorite | Level 6

Hi, Steve. Thanks for the quick reply. I tried your suggestion but the problem persists.  Here is a screenshot of the warning message I received after I dropped the missing observations for the marital variable.

Screen Shot 2022-06-22 at 9.08.10 AM.png

Screen Shot 2022-06-22 at 9.16.14 AM.png

I will also note that most of my other nonbinary categorical variables have missing values as well but they don't trigger the same problem. For example:

Screen Shot 2022-06-22 at 9.19.48 AM.png

Perhaps this is another clue:  Because I received the warning that "the maximum likelihood estimates for the logistic regression model might not exist" while running the PROC PSMATCH procedure, I generated my propensity scores using the PROC LOGISTIC procedure.  I noticed that the output differed for the marital variables compared to the other nonbinary categorical variables:

Screen Shot 2022-06-22 at 9.29.10 AM.png

It looks like SAS is not seeing the obs that are coded as "1" for the marital variables.  Does this provide any insight into what is going on?

Kelly
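
For readers who want to reproduce this kind of check, a minimal PROC LOGISTIC sketch follows. It is an illustration under assumptions, not the model used in this thread: `work.analysis`, `treat`, `age`, and `income` are placeholder names, and the covariate list would need to match the real propensity score model.

```sas
/* Placeholder names (assumptions): work.analysis, treat, age, income. */
proc logistic data=work.analysis;
   /* Declare the 0/1 dummies as classification variables */
   class resp_never_married resp_sep_div_wid / param=ref;
   /* Model the probability that treat = 1 */
   model treat(event='1') = resp_never_married resp_sep_div_wid age income;
run;
```

In the output, comparing "Number of Observations Read" with "Number of Observations Used" shows how many rows were dropped because of missing values, and the "Class Level Information" table shows which levels of each dummy the procedure reports.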

 

 

 

 

Kelly_K
Fluorite | Level 6

Hi Michael and Steve.  Your focus on missing values helped me figure out why SAS was not recognizing the marital status dummies as binary.  One of the covariates I included in the logistic regression model used to generate the propensity scores was spouse/partner's employment.  The "is your spouse employed" question was asked only if a person answered the marital status question by saying they were married or had a partner, so that covariate is missing for everyone else.  Including it therefore meant that every observation kept in the model was, by definition, "married".  This in turn caused the marital status dummies to lose their variability: all obs showed a value of 1 for the married status dummies and a value of 0 for the other marital status dummies.  I was able to resolve the problem by eliminating the spouse employed variable, thus retaining the variance in the marital status dummies.

 

Hope that makes sense.   Thanks for your help!

Kelly
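
The pattern described above can be made visible with a cross-tabulation of the marital dummies against a missingness flag for the skip-pattern covariate. This is a minimal sketch: `work.analysis` and `spouse_employed` are hypothetical names, since the thread does not give the actual data set or variable name.

```sas
/* Hypothetical names (assumptions): work.analysis, spouse_employed. */
data check;
   set work.analysis;
   /* 1 when the spouse/partner employment item is missing, 0 otherwise */
   spouse_emp_missing = missing(spouse_employed);
run;

/* Cross-tabulate the missingness flag against each marital dummy */
proc freq data=check;
   tables spouse_emp_missing * (resp_never_married resp_sep_div_wid)
          / missing norow nocol nopercent;
run;
```

If every row with `resp_never_married = 1` falls in the `spouse_emp_missing = 1` row of the table, then including that covariate in the model drops all of those observations, and the dummy is left with only one level.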
