Hi! I'm really stuck. After searching for an answer for many days, I'm posting my first-ever question on the SAS boards. I'll give a lot of background before I ask my questions. Here we go...
Firstly, I'm using SAS Version 9.4.
My logistic regression model consists of a binary outcome (ever vs. never), a binary main predictor ("X1"; also ever vs. never), several binary covariates, one ordered categorical covariate (age), and one non-ordered categorical covariate (race). (I should note that I originally used a model with age as a continuous predictor, but it did not pass the scale-check as a continuous variable, so I categorized it and everything checked out for it to be categorical.)
I began model building with a non-imputed dataset and followed Hosmer & Lemeshow's "purposeful selection" process from Applied Logistic Regression, which specifies using a likelihood ratio test, not Wald p-values, to determine the significance of interaction terms. During this process I identified a significant interaction involving X1 (X1*X5, with non-missing data), which I included in the final model. The final model passed the H&L goodness-of-fit test.
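For readers following along, here is a minimal sketch of how a likelihood ratio test for one interaction term could be run. It is not the exact code from my analysis: the dataset name work.analysis, the outcome name Y, and the numeric 0/1 coding of Y, X1, and X5 are assumptions, and the other covariates are left out for brevity. The statistic is the drop in -2 Log L between the model without the interaction and the model with it, compared against a chi-square distribution with 1 degree of freedom (one added parameter).

```sas
/* Sketch only: work.analysis, Y, and the 0/1 numeric coding are assumptions. */
/* Both models must be fit to the same observations for the test to be valid. */

/* Reduced model: main effects only */
ods output FitStatistics=work.fit_reduced;
proc logistic data=work.analysis;
  model Y(event='1') = X1 X5;
run;

/* Full model: main effects plus the interaction */
ods output FitStatistics=work.fit_full;
proc logistic data=work.analysis;
  model Y(event='1') = X1 X5 X1*X5;
run;

/* Likelihood ratio statistic: difference in -2 Log L between the two fits */
data work.lrt;
  merge work.fit_reduced(where=(Criterion='-2 Log L')
                         rename=(InterceptAndCovariates=m2ll_reduced))
        work.fit_full(where=(Criterion='-2 Log L')
                      rename=(InterceptAndCovariates=m2ll_full));
  G = m2ll_reduced - m2ll_full;   /* likelihood ratio chi-square */
  df = 1;                         /* one extra parameter in the full model */
  p_value = 1 - probchi(G, df);   /* upper-tail chi-square probability */
  keep m2ll_reduced m2ll_full G df p_value;
run;

proc print data=work.lrt noobs;
run;
```

If the interaction involved a covariate with more than two levels, df would need to equal the number of added parameters rather than 1.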
There are four variables with missing data in my dataset. Three have 1-2% missing data each, but one of the binary predictors, X3, has 12% missing data. Moreover, a literature review in this field reveals that the outcome is quite often stratified by X3 because of a significant interaction. However, no interaction with X3 was identified in my non-imputed model building process. I have a hunch that this is because X3 is missing so many data points... So I decided to use multiple imputation for the first time! Yay for the unknown! I have read A LOT on the topic, but I am still a beginner at it.
Many have noted that it is important to include interaction terms in a multiple imputation model, so I included the known interaction (X1*binarycovariate with non-missing data) and the suspected interaction (X1*X3 with missing data) in the "PROC MI" step. The "FCS LOGISTIC" option in "PROC MI" would not allow me to "impute" an interaction term in the "X1*X3" format (the way interactions are specified in "PROC LOGISTIC"). So I created a new variable, "X1intX3", by multiplying X1 by X3, taking care to leave it missing where X3 is missing. Syntax below:
title1 'imputation phase';
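A minimal sketch of the interaction-variable step described above, assuming numeric 0/1 coding for X1 and X3. The dataset names work.analysis and work.analysis_prepped are hypothetical placeholders, and this is an illustration rather than the original code.

```sas
/* Sketch only: dataset names are hypothetical; X1 and X3 assumed numeric 0/1 */
data work.analysis_prepped;
  set work.analysis;
  /* Leave the product missing whenever either component is missing, */
  /* so the imputation step treats it as a value still to be filled in */
  if missing(X1) or missing(X3) then X1intX3 = .;
  else X1intX3 = X1 * X3;
run;
```

Setting the value to missing explicitly gives the same result as a bare X1*X3 product, but avoids the "missing values were generated" note in the log.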
Hello,
Did you ever get an answer to your question 2?
Did you end up stratifying by your interaction terms?
I'm having the same problem trying to stratify my OR by level of the third variable.
If you found a way, could you please share the syntax?
Sincerely,
Manuel