08-07-2015 11:42 PM
I am using SAS 9.3 to perform multiple imputation for systolic blood pressure (SBP) using the FCS predictive mean matching method.
We are doing an analysis for a 3-arm randomized clinical study. We measured SBP at baseline and then at 6-month intervals over 2 years. In the analysis, we used a generalized estimating equation (GEE) model to analyze baseline and 2-year SBP. The SBP measurements (baseline, 6 m, 12 m, 18 m and 24 m) displayed an arbitrary missing pattern (no missing SBP at baseline). Because only baseline and 2-year SBP were analyzed, only 2-year SBP was imputed with the FCS predictive mean matching method. The remaining variables were imputed by the SAS default methods.
The variables included in the imputation model for imputing 2 year SBP (sbp_24m) were:
Continuous variables: SBP at baseline, 6 m, 12 m, 18 m (sbp0, sbp_6m, sbp_12m, sbp_18m); age; BMI; physical activity score (mets_0)
Categorical variables: intervention group (3 categories: treat), gender (binary), education (binary), smoke (binary), diabetes (binary), antihypertensive use (binary: antihyp)
Variables with missing data: sbp_6m, sbp_12m, sbp_18m, sbp_24m, bmi, diabetes.
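For reference, the missing data pattern of the continuous variables can be listed without imputing anything. This is only a minimal sketch: it assumes the same input data set as my code, it leaves out diabetes because that variable is categorical, and it relies on NIMPUTE=0 and the MissPattern ODS table of PROC MI.

```sas
/* List the missing data patterns only; nimpute=0 skips the imputation step */
proc mi data=WORK nimpute=0;
  /* continuous variables only; diabetes is left out because it is categorical */
  var sbp0 sbp_6m sbp_12m sbp_18m sbp_24m bmi;
  /* keep just the "Missing Data Patterns" table in the output */
  ods select MissPattern;
run;
```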
My code is:
PROC mi data=WORK seed=20150805 nimpute=20 out=mi_WORK;
class gender education antihyp smoke diabetes treat ;
var sbp_24m sbp0 sbp_6m sbp_12m sbp_18m age gender education treat bmi smoke METS_0 antihyp Diabetes ;
FCS nbiter=20 regpmm(sbp_24m=sbp0 sbp_6m sbp_12m sbp_18m age gender education treat bmi smoke METS_0 antihyp Diabetes);
run;
The procedure ran successfully. However, I found that if I change the order of the variables in the "var" statement (other than sbp_24m) or the order of the predictors in the regression model of the "regpmm" option, I get a different set of imputed values for sbp_24m. The pooled results from the GEE model therefore differ each time I change the order of the variables. In the "var" statement, sbp_24m is listed first because I want to impute this variable first, using the observed values of the other variables. As for the "regpmm" option, since it specifies a regression model, I would expect the order of the predictors not to affect the imputation results.
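To make the comparison concrete, here is a minimal sketch of how the two runs can be set side by side. It assumes the same input data set and seed as my code. The output data set names (mi_order1, mi_order2) and the particular alternative variable order in the second run are made up for illustration only.

```sas
/* Run 1: variable order as in my original code */
proc mi data=WORK seed=20150805 nimpute=20 out=mi_order1;
  class gender education antihyp smoke diabetes treat;
  var sbp_24m sbp0 sbp_6m sbp_12m sbp_18m age gender education treat
      bmi smoke METS_0 antihyp Diabetes;
  fcs nbiter=20 regpmm(sbp_24m = sbp0 sbp_6m sbp_12m sbp_18m age gender
      education treat bmi smoke METS_0 antihyp Diabetes);
run;

/* Run 2: same seed and same model; only the order after sbp_24m in VAR differs */
proc mi data=WORK seed=20150805 nimpute=20 out=mi_order2;
  class gender education antihyp smoke diabetes treat;
  var sbp_24m age bmi METS_0 gender education treat smoke antihyp Diabetes
      sbp0 sbp_6m sbp_12m sbp_18m;
  fcs nbiter=20 regpmm(sbp_24m = sbp0 sbp_6m sbp_12m sbp_18m age gender
      education treat bmi smoke METS_0 antihyp Diabetes);
run;

/* Compare the imputed sbp_24m values row by row across the two outputs */
proc compare base=mi_order1 compare=mi_order2;
  var sbp_24m;
run;
```

PROC COMPARE matches observations by position here, which should be appropriate as long as both output data sets keep the input row order within each value of _Imputation_.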
Can you tell me why?
Also, should multicollinearity be considered, since I included SBP at several time points in the same model? Finally, will the categorical variables be modeled as categorical variables in the regression model if they are placed in the
Thank you very much!