Solved: Odds ratios and missing data

Wolverine · Posted 09-14-2023 10:05 AM

I'm calculating odds ratios between two binary (yes/no) variables. At some point in the process, the missing values for one of the variables were recoded to -1 and left that way when the odds ratios were run. Fearing this might lead to inaccurate results, I recoded the -1's back to missing, ran frequencies to verify that all -1's were now missing, and reran the odds ratios. The results didn't change at all.
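For anyone following along, a recode-and-verify step of the kind described might look roughly like the sketch below. It assumes SI_flag is numeric, it borrows the dataset name analysis_for_model from the code posted later in this thread, and the output name analysis_recode is hypothetical.

```sas
/* Sketch only: assumes SI_flag is numeric; analysis_recode is a hypothetical name. */
data analysis_recode;
    set analysis_for_model;
    if SI_flag = -1 then SI_flag = .;   /* turn the -1 placeholder back into missing */
run;

/* MISSING lists missing values as their own row, so a leftover -1 would be visible here. */
proc freq data=analysis_recode;
    tables SI_flag / missing;
run;
```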

After thinking about it, I believe the explanation is that the value of -1 would affect stats like mean and standard deviation. But for odds ratios, these variables are being treated as categorical rather than continuous because there are only 3 levels. Whether the missing cases are assigned a category ("-1") or just left as missing doesn't affect the odds for the "0" and "1" categories. Is my thinking correct, or is there something else going on here?

ballardw · Posted 09-14-2023 10:13 AM

You should probably provide the code that you were using and indicate the variable(s) of interest. Several procedures calculate odds ratios, and they may behave a bit differently depending on the procedure and the options chosen.

Also, do you have any sort of custom format applied to the variable? A variable used in a grouping role might well show no difference if one of the ranges in a custom format were defined as "low - 1" or similar.
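One quick way to check for an attached format is PROC CONTENTS, whose variable listing shows any format assigned to each variable. A minimal sketch, using the dataset name that appears in the code posted below:

```sas
/* Sketch only: look for a format next to SI_flag in the variable listing. */
proc contents data=analysis_for_model;
run;
```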

Wolverine · Posted 09-14-2023 10:22 AM

No custom formats have been applied. "SI_flag" is the variable of interest. Here is the code:

PROC LOGISTIC data=analysis_for_model;
    CLASS SI_flag (ref="0");
    MODEL mgmt_flag(EVENT='1')=SI_flag;
    ODS output oddsratios=SI_flag_OR;
RUN;
QUIT;

ballardw · Posted 09-14-2023 11:05 AM

Okay, nothing complex.

Did you verify that the observations with an original value of -1 actually have a value for the dependent variable? If whatever caused the -1 (or the original missing value) for SI_flag is also associated with a missing value for mgmt_flag, then those observations would not have been used in either run, as there was nothing for the calculation.

Check the logs from running with both sets of data and see how many observations were read and how many were used by the model. If the number used didn't change, I strongly suspect that the dependent variable is missing as well.
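As an alternative to reading the log, the table of observations read and used that PROC LOGISTIC prints can be captured as a dataset and compared across the two runs. This is only a sketch: it assumes the ODS table is named NObs (running ODS TRACE ON first will confirm the name in your session), and nobs_check is a hypothetical dataset name. Run it once against each version of the data.

```sas
/* Sketch only: nobs_check is a hypothetical output dataset name. */
ods output NObs=nobs_check;

proc logistic data=analysis_for_model;
    class SI_flag (ref="0");
    model mgmt_flag(event='1') = SI_flag;
run;

/* Shows the number of observations read and the number used by the model. */
proc print data=nobs_check noobs;
run;
```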

Wolverine · Posted 09-14-2023 11:32 AM

mgmt_flag doesn't have any missing values -- all values are either 0 or 1. Among the records with missing SI_flag, mgmt_flag is sometimes 0 and sometimes 1.
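A cross-tabulation that keeps missing values as their own level is one way to confirm this. A minimal sketch, assuming the same dataset as in the PROC LOGISTIC code above:

```sas
/* Sketch only: MISSING makes missing SI_flag (or mgmt_flag) appear as a table level. */
proc freq data=analysis_for_model;
    tables SI_flag*mgmt_flag / missing norow nocol nopercent;
run;
```

If mgmt_flag had missing values, they would show up as an extra column in this table.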

In the version of the file where SI_flag is set to missing, there is a note in the log. The number it reports matches the number of records where SI_flag is missing:

Note: 5389 observations were deleted due to missing values for the response or explanatory variables.

When running the version with missing set to -1, the odds ratio output shows this as point estimate (confidence interval):

SI_flag -1 vs 0 0.968 (0.897 1.044)
SI_flag 1 vs 0 0.742 (0.608 0.906)

When running the version recoded back to missing, the odds ratio output shows this as point estimate (confidence interval):

SI_flag 1 vs 0 0.742 (0.608 0.906)

ballardw · Posted 09-14-2023 12:28 PM

You didn't change the relative counts of 1 vs 0. Those stayed the same. If you had changed the -1 to 0 the ratios would change.
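To make this concrete, here is a small sketch with made-up cell counts (they are not the counts from this thread, and all variable names are hypothetical). The odds ratio for 1 vs 0 uses only the four cells where SI_flag is 1 or 0, so the -1 or missing rows never enter the calculation. Folding those rows into the 0 group changes the reference odds and therefore the ratio. With a single categorical predictor, the estimate from PROC LOGISTIC should match this cross-product ratio.

```sas
/* Sketch only: all counts below are hypothetical. */
data _null_;
    /* counts of mgmt_flag = 1 (event) and mgmt_flag = 0 (nonevent) per SI_flag level */
    n1_event = 120;   n1_nonevent = 480;    /* SI_flag = 1 */
    n0_event = 900;   n0_nonevent = 2700;   /* SI_flag = 0 */
    nm_event = 1000;  nm_nonevent = 4000;   /* SI_flag = -1 or missing */

    /* 1 vs 0: the -1/missing counts are not used at all */
    or_1_vs_0 = (n1_event / n1_nonevent) / (n0_event / n0_nonevent);

    /* if the -1 rows had been recoded to 0, the reference group's odds would change */
    or_merged = (n1_event / n1_nonevent)
              / ((n0_event + nm_event) / (n0_nonevent + nm_nonevent));

    put or_1_vs_0= or_merged=;
run;
```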

Wolverine · Posted 09-14-2023 12:39 PM

@ballardw wrote:

You didn't change the relative counts of 1 vs 0. Those stayed the same. If you had changed the -1 to 0 the ratios would change.

I think I understand now... because the relative counts didn't change, the odds of having a value of 1 vs 0 didn't change either. Thanks!
