dwhitney
Calcite | Level 5

I am running PROC MI for multiple imputation for a 5-level categorical variable, "gmfcs_final", which is the only variable in the dataset with missing values. The imputation phase works great (code under "STEP 1").

 

I then do the 2nd phase, the analysis (code under "STEP 2"). This analysis is focused purely on estimating the c-statistic with its 95% CI for several models based on various covariate sets. This runs well, and I output the c-statistic and CI to the "auc_2" output data set.

 

For the 3rd step, I am unable to figure out what code to run to pool the c-statistics and CIs appropriately (Rubin's rules?). PROC MIANALYZE seems to have code for pooling parameter estimates, but those are not the interest of this study. Any idea how to use PROC MIANALYZE, or other code, to get appropriately pooled c-statistics and CIs?

 

STEP 1

proc mi data=b seed=1305417 nimpute=65 out=mi_fcs;
class gmfcs_final sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base 
fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn 
fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn 
fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn;
var gmfcs_final age sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base 
fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn 
fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn 
fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn;
fcs discrim(gmfcs_final = age sex race ethnicity smoking_num ins yr_start WCI_score_1cl W1-W25 base_2-base_19 fx1_base fx2_base fx3_base 
fu_2_5yr_censrsn fu_3_5yr_censrsn fu_4_5yr_censrsn fu_5_5yr_censrsn fu_6_5yr_censrsn
fu_7_5yr_censrsn fu_8_5yr_censrsn fu_9_5yr_censrsn fu_10_5yr_censrsn fu_11_5yr_censrsn fu_12_5yr_censrsn 
fu_13_5yr_censrsn fu_14_5yr_censrsn fu_15_5yr_censrsn fu_16_5yr_censrsn fu_17_5yr_censrsn fu_18_5yr_censrsn 
fu_19_5yr_censrsn fu_fx1_5yr_censrsn fu_fx2_5yr_censrsn fu_fx3_5yr_censrsn death_5yr_censrsn /classeffects=include) nbiter=100; 
run;
 
STEP 2
proc logistic data=mi_fcs plots(only)=roc;
class sex race3 smoking_num ins2 yr_start_cat W24 W25 WCI_score_1cl gmfcs_final;
model fu_2_5yr(event='1')=age sex race3 smoking_num ins2 yr_start_cat WCI_score_1cl gmfcs_final / nofit;
roc 'Base model' age sex race3 smoking_num ins2 yr_start_cat;
roc 'GMFCS only' gmfcs_final;
roc 'WCI only' WCI_score_1cl;
roc 'Base+GMFCS' gmfcs_final age sex race3 smoking_num ins2 yr_start_cat;
roc 'Base+WCI' WCI_score_1cl age sex race3 smoking_num ins2 yr_start_cat;
by _imputation_;
ods output rocassociation=auc_2;
run;
1 ACCEPTED SOLUTION

Ksharp
Super User

I noticed that you can already get the standard error of the c-statistic in PROC LOGISTIC's output.

You just feed it into PROC MIANALYZE to get the pooled c-statistic and its CI, i.e., there is no need to use the bootstrap method.

 

proc logistic data=sashelp.class;
model sex=weight height/rocci ;
run;
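
Applied to the imputed data from the question, the pooling step might look like the sketch below. It assumes the ROCAssociation table saved as auc_2 contains the variables ROCModel, Area (the c-statistic), and StdErr, plus _Imputation_ from the BY statement; please confirm the names with PROC CONTENTS before relying on it. The data set name auc_pooled is just a placeholder.

```sas
/* Sketch only: variable names are assumed from the ROCAssociation table. */
/* Check them first with: proc contents data=auc_2; run;                  */

/* One row per imputation within each ROC model */
proc sort data=auc_2;
   by ROCModel _Imputation_;
run;

/* Pool the c-statistic (Area) with Rubin's rules, one analysis per model */
proc mianalyze data=auc_2;
   by ROCModel;
   modeleffects Area;   /* per-imputation c-statistic           */
   stderr StdErr;       /* its per-imputation standard error    */
   ods output ParameterEstimates=auc_pooled;   /* placeholder name */
run;
```

The ParameterEstimates table should then hold, for each ROC model, the pooled c-statistic with its standard error and 95% confidence limits.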

dwhitney
Calcite | Level 5

Thank you for the response, Ksharp. Unfortunately, I don't think bootstrapping will work. I don't know the details, but I believe the 3rd step in MI (pooling the estimates) uses between- and within-variance matrices...? Below is a copy and paste from the SAS documentation. The last sentence is what I am basing my response, and hesitation to bootstrap, on.

 

"The MIANALYZE procedure reads parameter estimates and associated standard errors or covariance matrix that are computed by the standard statistical procedure for each imputed data set. The MIANALYZE procedure then derives valid univariate inference for these parameters. With an additional assumption about the population between and within imputation covariance matrices, multivariate inference based on Wald tests can also be derived."

Ksharp
Super User
" associated standard errors "
Once you have the standard error of the estimated parameter, you can feed it into PROC MIANALYZE and get the pooled value.
I meant that you could obtain this standard error with the bootstrap method.
As for the "last sentence": if your data set is big enough, I think bootstrap-based inference would also be "based on Wald tests".

@Rick_SAS would know more details.
Season
Lapis Lazuli | Level 10

Yes, you can use PROC MIANALYZE (i.e., Rubin's rules) to pool the c-statistics and get the corresponding 95% CI of the pooled c-statistic. The theoretical foundation of this practice is that asymptotic normality of the c-statistic holds.
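
For reference, this is roughly what Rubin's rules compute for a single quantity. The notation is my own: $\hat{Q}_i$ is the c-statistic from imputed data set $i$, $U_i$ is its squared standard error, and $m$ is the number of imputations (65 in the question).

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(pooled c-statistic)} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}U_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U}+\Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
\nu &= (m-1)\Bigl[1+\frac{\bar{U}}{(1+1/m)\,B}\Bigr]^2
  && \text{(degrees of freedom)} \\
\text{95\% CI} &= \bar{Q}\pm t_{\nu,\,0.975}\sqrt{T}
\end{align*}
```

The interval relies on the approximate normality of each $\hat{Q}_i$, which is why the asymptotic normality of the c-statistic matters here.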

As for Bootstrap, unfortunately, Bootstrap can be of some use in the imputation process, but is not used for pooling the results generated in each imputed dataset.

dwhitney
Calcite | Level 5

Thank you, @Ksharp and @Season! This worked out.
