xiangpang
Quartz | Level 8

Does anyone know why different "nimpute" values produce the same imputed data in proc mi with EM algorithm imputation, but different means and covariances?

 

thanks

data Fitness1;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.609  11.37  178     45.313  10.07  185
54.297   8.65  156     59.571    .      .
49.874   9.22    .     44.811  11.63  176
  .     11.95  176          .  10.85    .
39.442  13.08  174     60.055   8.63  170
50.541    .      .     37.388  14.03  186
44.754  11.12  176     47.273    .      .
51.855  10.33  166     49.156   8.95  180
40.836  10.95  168     46.672  10.00    .
46.774  10.25    .     50.388  10.08  168
39.407  12.63  174     46.080  11.17  156
45.441   9.63  164       .      8.92    .
45.118  11.08    .     39.203  12.88  168
45.790  10.47  186     50.545   9.93  148
48.673   9.40  186     47.920  11.50  170
47.467  10.50  170
;
proc mi data=Fitness1 seed=235 simple nimpute=16;
   em itprint out=MI_results3;
   var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2;
   em itprint out=MI_results5;
   var Oxygen RunTime RunPulse;
run;
8 REPLIES 8
PGStats
Opal | Level 21

Not sure I get your question. If you want to see the imputed values, you should ask for:

 

proc mi data=Fitness1 seed=235 simple nimpute=16 out=MI_values3;
   em itprint out=MI_results3;
   var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2 out=MI_values5;
   em itprint out=MI_results5;
   var Oxygen RunTime RunPulse;
run;
PG
xiangpang
Quartz | Level 8

Thanks for your reply, but it does not answer my question. If you compare the data in "MI_values3" and "MI_values5", you will find only one imputed value for each missing data point, and it is the same in both. Whether "nimpute=2" or "nimpute=16" is used, this imputed value does not change. With FCS or MCMC, by contrast, you get several different imputed values according to the nimpute setting. So I want to know why Proc MI with EM produces only one imputed value.

Does that mean nimpute is not useful here?

 

Thanks

 

 

PGStats
Opal | Level 21

My understanding is that the sequences of random imputed values in MI_values3 and MI_values5 are the same because they stem from the same seed (235). Specifying the seed is supposed to guarantee that you get exactly the same values.

PG
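
One way to check this, sketched under the assumption that both PROC MI steps above have been run and that MI_values3 and MI_values5 exist in the WORK library: the OUT= data set on the PROC MI statement carries an _Imputation_ index, so the first two imputations of the NIMPUTE=16 run can be compared with the NIMPUTE=2 run. The data set name first2 is made up for illustration.

```sas
/* Count the rows that belong to each imputation in the NIMPUTE=16 output */
proc freq data=MI_values3;
   tables _Imputation_ / nocum nopercent;
run;

/* Keep only imputations 1 and 2 from the NIMPUTE=16 run */
data first2;
   set MI_values3;
   where _Imputation_ <= 2;
run;

/* Report any value differences against the NIMPUTE=2 run */
proc compare base=first2 compare=MI_values5;
run;
```

Whether the two runs agree on those first two imputations is exactly what the PROC COMPARE report would show; no particular outcome is assumed here.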
xiangpang
Quartz | Level 8

Thanks for your reply. So are you suggesting that nimpute is not useful here? But the mean and cov differ between those two runs. Why?

What I expected was to get as many different imputed values as the nimpute setting.

 

Good night

SAS_Rob
SAS Employee

The OUT= data set on the EM statement produces a single imputed value based on the EM algorithm. This will be the same regardless of the seed or the number of imputations, since the EM estimates will always be the same (see the EM (MLE) Parameter Estimates table in the output). When you actually apply multiple imputation via the MCMC algorithm, you will see differences based on the seed and the number of imputations.
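
To see the contrast described above, here is a minimal sketch, assuming the Fitness1 data set from the original post is in the WORK library. The data set name MI_mcmc is hypothetical. The OUT= data set on the PROC MI statement stacks the imputations and identifies them with the _Imputation_ variable, so the means can be compared per imputation.

```sas
/* Multiple imputation with MCMC; OUT= goes on the PROC MI statement */
proc mi data=Fitness1 seed=235 nimpute=5 out=MI_mcmc;
   mcmc;
   var Oxygen RunTime RunPulse;
run;

/* One row of means per imputation */
proc means data=MI_mcmc mean;
   class _Imputation_;
   var Oxygen RunTime RunPulse;
run;
```

Because the steps in the original post combine the EM statement with NIMPUTE greater than 0, PROC MI appears to run its default multiple imputation in addition to EM. That would probably explain why the multiple imputation tables in the printed output change with NIMPUTE while the EM tables do not.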

 


pengznuc
Fluorite | Level 6

Thanks for your reply. But why would the number of imputations change the cov in proc mi when using EM, not MCMC?

 

SAS_Rob
SAS Employee

I am not sure I understand what you are asking. The EM covariance matrix remains the same either way.

 

proc mi data=Fitness1 seed=235 simple nimpute=16;
   em itprint out=MI_results3 outem=out1;
   var Oxygen RunTime RunPulse;
run;
proc mi data=Fitness1 seed=235 simple nimpute=2;
   em itprint out=MI_results5 outem=out2;
   var Oxygen RunTime RunPulse;
run;

proc print data=out1;
run;
proc print data=out2;
run;
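
A sketch of how the two EM results could be compared programmatically rather than by eye, assuming the two PROC MI steps above have been run. PROC COMPARE reports any differences between the OUTEM= data sets and between the EM OUT= data sets. The last step assumes that only the EM output is of interest, in which case NIMPUTE=0 skips the multiple imputation; the data set name out0 is made up for illustration.

```sas
/* EM parameter estimates from the two runs */
proc compare base=out1 compare=out2;
run;

/* EM-imputed data from the two runs */
proc compare base=MI_results3 compare=MI_results5;
run;

/* EM only: NIMPUTE=0 requests no multiple imputation */
proc mi data=Fitness1 seed=235 simple nimpute=0;
   em itprint outem=out0;
   var Oxygen RunTime RunPulse;
run;
```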

 

pengznuc
Fluorite | Level 6

Thanks. It is strange. It was different last night, which is why I was confused.

Thanks to everyone for the help.
