Does anyone know why different "nimpute" values would produce the same imputed data in proc mi with EM algorithm imputation, but a different mean / cov?
thanks
data Fitness1;
input Oxygen RunTime RunPulse @@;
datalines;
44.609 11.37 178 45.313 10.07 185
54.297 8.65 156 59.571 . .
49.874 9.22 . 44.811 11.63 176
. 11.95 176 . 10.85 .
39.442 13.08 174 60.055 8.63 170
50.541 . . 37.388 14.03 186
44.754 11.12 176 47.273 . .
51.855 10.33 166 49.156 8.95 180
40.836 10.95 168 46.672 10.00 .
46.774 10.25 . 50.388 10.08 168
39.407 12.63 174 46.080 11.17 156
45.441 9.63 164 . 8.92 .
45.118 11.08 . 39.203 12.88 168
45.790 10.47 186 50.545 9.93 148
48.673 9.40 186 47.920 11.50 170
47.467 10.50 170
;
proc mi data=Fitness1 seed=235 simple nimpute=16;
em itprint out=MI_results3;
var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2;
em itprint out=MI_results5;
var Oxygen RunTime RunPulse;
run;
The OUT= data set on the EM statement produces a single imputed value based on the EM algorithm. This will be the same regardless of the seed or the number of imputations, since the EM estimates will always be the same (see the EM (MLE) Parameter Estimates table in the output). When you actually apply multiple imputation via the MCMC algorithm, you will see differences based on the seed and the number of imputations.
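To make the distinction concrete, here is a minimal sketch that requests both outputs in one run. It assumes the Fitness1 data set from the question. The data set names MI_em and MI_mcmc are made up for illustration. The MCMC statement is written out explicitly only to make the imputation method visible.

```sas
/* Sketch: one run that writes both the EM-based data set and the MCMC imputations. */
/* MI_em and MI_mcmc are hypothetical data set names.                               */
proc mi data=Fitness1 seed=235 nimpute=2 out=MI_mcmc;
   em out=MI_em;      /* single completed data set built from the EM (MLE) estimates */
   mcmc;              /* multiple imputation by MCMC; fills OUT= on PROC MI          */
   var Oxygen RunTime RunPulse;
run;

/* Per-imputation summaries: NIMPUTE= controls how many _Imputation_ groups exist */
proc means data=MI_mcmc mean std;
   class _Imputation_;
   var Oxygen RunTime RunPulse;
run;

/* The EM data set is expected to hold one completed copy of the data */
proc means data=MI_em mean std;
   var Oxygen RunTime RunPulse;
run;
```

If the means and covariances being compared come from the multiple-imputation tables in the PROC MI output rather than from the EM tables, they would be expected to change with nimpute, because they summarize a different number of imputed data sets.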
Not sure I get your question. If you want to see the imputed values, you should ask for:
proc mi data=Fitness1 seed=235 simple nimpute=16 out=MI_values3;
em itprint out=MI_results3;
var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2 out=MI_values5;
em itprint out=MI_results5;
var Oxygen RunTime RunPulse;
run;
Thanks for your reply, but it does not answer my question. If you compare the data in "MI_values3" and "MI_values5", you will find only one imputed value for each missing data point, and it is the same in both. Whether "nimpute=2" or "nimpute=16", this imputed value does not change. With FCS or MCMC, by contrast, you get several different imputed values, as many as nimpute specifies. So I want to know why Proc MI EM gives only one imputed value.
Does that mean nimpute is not useful here?
Thanks
My understanding is that the sequences of random imputed values in MI_values3 and MI_values5 are the same because they stem from the same seed (235). Specifying the seed is supposed to guarantee that you get exactly the same values.
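One way to check this is to compare the imputations that the two output data sets have in common. This is only a sketch. It assumes MI_values3 and MI_values5 were created by the OUT= options on the PROC MI statements above and that both contain the _Imputation_ index variable.

```sas
/* Keep only imputations 1 and 2 from the 16-imputation run, */
/* then compare them with the 2-imputation run.              */
proc compare base=MI_values3(where=(_Imputation_ <= 2))
             compare=MI_values5;
run;
```

If the seed fully determines the sequence, PROC COMPARE should report no unequal values for these shared imputations. MI_values3 would simply carry 14 additional imputations beyond them.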
Thanks for your reply. So you are suggesting that nimpute is not useful here? But there is a difference in mean and cov between those two. Why?
What I expected was to get as many different imputed values as the nimpute setting.
Good night
Thanks for your reply. But why could the number of imputations change the cov in proc mi with EM, not MCMC?
I am not sure I am understanding what you are asking. The EM covariance matrix remains the same either way.
proc mi data=Fitness1 seed=235 simple nimpute=16;
em itprint out=MI_results3 outem=out1;
var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2;
em itprint out=MI_results5 outem=out2;
var Oxygen RunTime RunPulse;
run;
proc print data=out1;
proc print data=out2;
run;
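Instead of reading the two printouts side by side, the OUTEM= data sets can also be compared directly. A minimal sketch, assuming out1 and out2 were created by the two runs above:

```sas
/* Compare the EM (MLE) means and covariances from the two runs */
proc compare base=out1 compare=out2;
run;
```

If the EM estimates really do not depend on nimpute, the comparison should report that all values are equal.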
Thanks. It is strange. It was different last night, which is why I was confused.
Thanks for the help from all the people.