I am running PROC MI for multiple imputation for a 5-level categorical variable, "gmfcs_final", which is the only variable in the dataset with missing values. The imputation phase works great (code under "STEP 1").
I then do the 2nd analysis phase (code under "STEP 2"). This analysis is focused purely on estimating the c-statistic with 95% CI for several models based on various covariate sets. This runs well and I output the c-stat and CI in the "auc_2" output file.
For the 3rd step, I am unable to figure out what code to run to pool the c-stats and CI appropriately (Rubin's rules?). There seems to be code to get parameter estimates in the PROC MIANALYZE, but that is not the interest of this study. Any idea on the code using the PROC MIANALYZE or other code to get the appropriately pooled c-stats and CI?
STEP 1
I noticed that PROC LOGISTIC's output already includes the standard error of the c-statistic.
You can just feed it into PROC MIANALYZE to get the pooled c-statistic and CI, i.e., there is no need to use the bootstrap method.
/* ROCCI requests the standard error and confidence limits for the c-statistic (AUC) */
proc logistic data=sashelp.class;
  model sex=weight height / rocci;
run;
Thank you for the response, Ksharp. Unfortunately, I don't think bootstrapping will work. I don't know the details, but I believe the 3rd step in MI (pooling the estimates) uses between- and within-variance matrices...? Below is a copy and paste from the SAS documentation. The last sentence is what I am basing my response, and hesitation to bootstrap, on.
"The MIANALYZE procedure reads parameter estimates and associated standard errors or covariance matrix that are computed by the standard statistical procedure for each imputed data set. The MIANALYZE procedure then derives valid univariate inference for these parameters. With an additional assumption about the population between and within imputation covariance matrices, multivariate inference based on Wald tests can also be derived."
Yes, you can use PROC MIANALYZE (i.e., Rubin's rules) to pool the c-statistics and get the corresponding 95% CI of the pooled c-statistic. The theoretical justification is that the c-statistic is asymptotically normal.
As for the bootstrap: it can be of some use in the imputation process, but it is not used for pooling the results generated in each imputed dataset.
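To make the pooling step concrete, here is a minimal sketch rather than a tested solution for your data. The dataset name mi_out, the outcome y, and the covariates x1 and x2 are hypothetical placeholders for whatever your STEP 1 and STEP 2 code uses. It also assumes that the ROCAssociation ODS table stores the c-statistic in a variable named Area and its standard error in StdErr, so please confirm that with PROC CONTENTS before relying on it.

```sas
/* Hypothetical names: mi_out is the stacked output of PROC MI   */
/* (one copy of the data per value of _Imputation_); y, x1, x2   */
/* stand in for your own outcome and covariates.                 */
proc logistic data=mi_out;
  by _Imputation_;
  model y(event='1') = x1 x2 / rocci;
  /* Save the c-statistic and its standard error per imputation */
  ods output ROCAssociation=auc_by_imp;
run;

/* Check the variable names (assumed here: Area and StdErr) */
proc contents data=auc_by_imp;
run;

/* Pool across imputations with Rubin's rules */
proc mianalyze data=auc_by_imp;
  modeleffects Area;
  stderr StdErr;
run;
```

The Parameter Estimates table from PROC MIANALYZE should then give the pooled c-statistic with its 95% confidence limits. If you add ROC statements to compare several covariate sets in one PROC LOGISTIC call, the ODS table will hold more than one row per imputation, so you would need to subset it to one model at a time before pooling.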