Solved: Re: Code to get the pooled c-statistic and 95% CI after multiple imput...

dwhitney · Posted 06-07-2024 03:58 PM

I am running PROC MI for multiple imputation for a 5-level categorical variable, "gmfcs_final", which is the only variable in the dataset with missing values. The imputation phase works great (code under "STEP 1").

I then run the 2nd phase, the analysis (code under "STEP 2"). This analysis is focused purely on estimating the c-statistic with its 95% CI for several models based on various covariate sets. This runs well, and I output the c-statistic and CI to the "auc_2" output data set.

For the 3rd step, I am unable to figure out what code to run to pool the c-stats and CI appropriately (Rubin's rules?). There seems to be code to get parameter estimates in the PROC MIANALYZE, but that is not the interest of this study. Any idea on the code using the PROC MIANALYZE or other code to get the appropriately pooled c-stats and CI?

STEP 1

proc mi data=b seed=1305417 nimpute=65 out=mi_fcs;
  class gmfcs_final sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base
        fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
        fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn
        fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn
        fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn;
  var gmfcs_final age sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base
      fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
      fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn
      fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn
      fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn;
  fcs discrim(gmfcs_final = age sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base
      fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
      fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn
      fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn
      fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn / classeffects=include) nbiter=100;
run;

STEP 2

proc logistic data=mi_fcs plots(only)=roc;
  by _imputation_;
  class sex race3 smoking_num ins2 yr_start_cat W24 W25 WCI_score_1cl gmfcs_final;
  model fu_2_5yr(event='1')=age sex race3 smoking_num ins2 yr_start_cat WCI_score_1cl gmfcs_final / nofit;
  roc 'Base model' age sex race3 smoking_num ins2 yr_start_cat;
  roc 'GMFCS only' gmfcs_final;
  roc 'WCI only' WCI_score_1cl;
  roc 'Base+GMFCS' gmfcs_final age sex race3 smoking_num ins2 yr_start_cat;
  roc 'Base+WCI' WCI_score_1cl age sex race3 smoking_num ins2 yr_start_cat;
  ods output rocassociation=auc_2;
run;

Ksharp · Posted 06-09-2024 02:22 AM

I noticed that you can already get the standard error of the c-statistic from PROC LOGISTIC's output.

Just feed it into PROC MIANALYZE to get the pooled c-statistic and CI; in other words, there is no need to use the bootstrap method.

proc logistic data=sashelp.class;
model sex=weight height/rocci ;
run;
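
For anyone landing here later, the pooling step (STEP 3) might look roughly like the sketch below. It assumes that auc_2 is the ROCAssociation table saved in STEP 2 and that it contains the columns _Imputation_, ROCModel, Area, and StdErr (worth confirming with PROC CONTENTS). The data set names auc_sorted and auc_pooled are made up for illustration.

```sas
/* Sort so that each ROC model forms a BY group with its imputations in order. */
/* Assumption: auc_2 has columns _Imputation_, ROCModel, Area, StdErr.        */
proc sort data=auc_2 out=auc_sorted;   /* auc_sorted is a hypothetical name */
  by ROCModel _Imputation_;
run;

/* Pool the c-statistic (Area) across imputations with Rubin's rules, */
/* one pooled estimate and 95% CI per ROC model.                      */
proc mianalyze data=auc_sorted;
  by ROCModel;
  modeleffects Area;    /* point estimate from each imputed data set */
  stderr StdErr;        /* its standard error                        */
  ods output ParameterEstimates=auc_pooled;   /* hypothetical output name */
run;
```

The pooled estimate and confidence limits should then appear in the Parameter Estimates table, one block per ROC model.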

Ksharp · Posted 06-08-2024 02:05 AM

Use the bootstrap method?
See Rick's blog posts:

https://blogs.sas.com/content/iml/2016/08/10/bootstrap-confidence-interval-sas.html
https://blogs.sas.com/content/iml/2018/06/20/bootstrap-method-example-sas.html
https://blogs.sas.com/content/iml/2018/07/23/boot-and-bootci-macros-sas.html
https://blogs.sas.com/content/iml/2022/05/23/balanced-bootstrap-sas.html

dwhitney · Posted 06-08-2024 01:28 PM

Thank you for the response, Ksharp. Unfortunately, I don't think bootstrapping will work. I don't know the details, but I believe the 3rd step in MI (pooling the estimates) uses between- and within-variance matrices...? Below is a copy and paste from the SAS documentation. The last sentence is what I am basing my response, and hesitation to bootstrap, on.

"The MIANALYZE procedure reads parameter estimates and associated standard errors or covariance matrix that are computed by the standard statistical procedure for each imputed data set. The MIANALYZE procedure then derives valid univariate inference for these parameters. With an additional assumption about the population between and within imputation covariance matrices, multivariate inference based on Wald tests can also be derived."

Ksharp · Posted 06-08-2024 09:28 PM

"associated standard errors"
Once you have the standard error of the estimated parameter, you can feed it into PROC MIANALYZE and get the pooled value.
I mean that you could obtain this standard error by the bootstrap method.
As for the "last sentence": if your data set is large enough, I think the bootstrap result would also be "based on Wald tests".

@Rick_SAS would know more details.

Season · Posted 06-09-2024 12:33 AM

Yes, you can use PROC MIANALYZE (i.e., Rubin's rules) to pool the c-statistics and get the corresponding 95% CI of the pooled c-statistic. The theoretical justification is that the c-statistic is asymptotically normal.

As for the bootstrap: it can be of some use in the imputation process, but it is not used for pooling the results generated from each imputed dataset.
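
For reference, here is a sketch of Rubin's rules for a single scalar estimate, which is what this pooling amounts to. The notation is mine, not from the thread: $\hat{Q}_i$ is the c-statistic and $SE_i$ its standard error from imputed data set $i$, and $m$ is the number of imputations (65 in STEP 1).

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(pooled c-statistic)} \\
W &= \frac{1}{m}\sum_{i=1}^{m} SE_i^{2}
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= W + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
\nu &= (m-1)\left[1+\frac{W}{(1+1/m)\,B}\right]^{2}
  && \text{(degrees of freedom)} \\
\text{95\% CI} &= \bar{Q} \pm t_{\nu,\,0.975}\,\sqrt{T}
\end{aligned}
```

The interval relies on the approximate normality of each $\hat{Q}_i$, which is where the asymptotic normality of the c-statistic comes in.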

dwhitney · Posted 06-09-2024 01:31 PM

Thank you, @Ksharp and @Season! This worked out.
