Hi everyone,

Currently I'm trying to perform multiple imputation on my original dataset of 1000 observations with missing data. Missing values were not coded as 99 or 999, but as " " or ".". I created 20 imputed datasets using PROC MI, used PROC GENMOD to compute parameter estimates per imputation, and then used PROC MIANALYZE to pool these estimates. So I followed all the steps of multiple imputation. However, I now have one giant imputed dataset with 20 x 1000 = 20,000 observations. If I run my analyses directly on this giant dataset, everything suddenly becomes significant due to the large sample size. How do I get one single, most optimal imputed dataset with 1000 observations (just like the original) from these 20 imputed datasets? I would appreciate it if someone could help me out!

Below you find my syntax. If you have any further feedback on it or tips for me, please let me know.

proc mi data=cko.blazib nimpute=20 seed=54321 out=cko.mi1_blazib;
class nac gesl ses_cbs stage histo ps cci;
var nac gesl leeft ses_cbs stage histo ps cci ckd_epi bmi;
fcs logistic (ps = nac gesl leeft ses_cbs stage histo cci ckd_epi bmi / link=logit) nbiter=200;
fcs logistic (cci = nac gesl leeft ses_cbs stage histo ps ckd_epi bmi / link=logit) nbiter=200;
fcs logistic (ses_cbs = nac gesl leeft stage histo ps cci ckd_epi bmi / link=logit) nbiter=200;
fcs regpmm (ckd_epi = nac gesl leeft ses_cbs stage histo ps cci bmi) nbiter=200;
fcs regpmm (bmi = nac gesl leeft ses_cbs stage histo ps cci ckd_epi) nbiter=200;
fcs plots=trace(mean std);
run;
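Since blanks and "." are read by SAS as ordinary missing values, it may also be worth confirming how much is actually missing per variable before imputing. A minimal check (just a sketch, using the variable names from the PROC MI step above, with the continuous and class variables split as in that step):

proc means data=cko.blazib n nmiss;
   var leeft ckd_epi bmi;
run;

proc freq data=cko.blazib;
   tables nac gesl ses_cbs stage histo ps cci / missing;
run;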
proc genmod data=cko.mi1_blazib;
class nac gesl ses_cbs stage histo ps cci;
model nac(event="1") = gesl leeft ses_cbs stage histo ps cci ckd_epi bmi /dist=binomial link=logit;
by _imputation_;
ods output ParameterEstimates=cko.gm_fcs;
run;
proc mianalyze parms(classvar=level)=cko.gm_fcs;
class gesl ses_cbs stage histo ps cci;
modeleffects INTERCEPT gesl leeft ses_cbs stage histo ps cci ckd_epi bmi;
run;
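For reference, the pooled estimates that PROC MIANALYZE prints can also be captured in a dataset with an ODS OUTPUT statement, the same way the PROC GENMOD estimates are captured above. A small sketch (the output dataset name cko.gm_pooled is my own choice, not from the original code):

proc mianalyze parms(classvar=level)=cko.gm_fcs;
   class gesl ses_cbs stage histo ps cci;
   modeleffects INTERCEPT gesl leeft ses_cbs stage histo ps cci ckd_epi bmi;
   ods output ParameterEstimates=cko.gm_pooled;
run;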