FanRu
Calcite | Level 5

Hello statisticians,

I have been using PROC MIANALYZE for some time, and I found that some variables were statistically significant in each imputed dataset after multiple imputation but were not statistically significant in the pooled PROC MIANALYZE results. I tried changing the number of imputations, but that did not help. I would be very grateful if anyone could explain this phenomenon.

 

 

 

 

6 REPLIES
Ksharp
Super User

Calling @Rick_SAS 

FanRu
Calcite | Level 5
Thanks a lot!
Rick_SAS
SAS Super FREQ

Please post your code so that we can have a better chance of answering your questions.

FanRu
Calcite | Level 5

I am using PROC TRAJ to build a trajectory model and explore whether the covariates affect each trajectory group. Since PROC TRAJ does not have a BY statement, I fit the model separately on each imputed dataset and then stacked the parameter estimates from all imputations.

 

trajectory_imputation(data):

 

ID  TIME   BMI    pulse   HDLC   LDLC   TC     TG     glucose   UA
A   2002   23.0   87.0    1.16   2.82   5.16   1.93   4.82      357
A   2003   22.8   .       1.18   .      .      1.80   .         .
A   2004   .      86.0    .      2.30   .      .      4.54      359
B   2003   24.0   86.0    1.19   2.30   5.17   1.75   4.54      358
.....

 

/*mi*/

proc mi data=trajectory_imputation out=imputed
seed=2021 nimpute=20;
var BMI pulse HDLC LDLC TC TG glucose UA;
mcmc;
run;

/*traj*/

data ParameterEstimates;
set oe; /*include PARMS STDERR COV*/
if _TYPE_="PARMS" or _TYPE_="STDERR";
run;

/*mianalyze for one trajectory group*/

proc transpose data=ParameterEstimates out=ParameterEstimates (rename=(_NAME_=Parameter PARMS=Estimate STDERR=StdErr));
var INTERC01 LINEAR01 QUADRA01 CUBIC01  BMI001 PULSE001  HDLC001 LDLC001 TC001 TG001 GLUCOSE001 UA001;
by _imputation_;
id _TYPE_;
run;


ODS OUTPUT ParameterEstimates=RESULT;
proc mianalyze parms=ParameterEstimates;
modeleffects INTERC01 LINEAR01 QUADRA01 CUBIC01  BMI001 PULSE001  HDLC001 LDLC001 TC001 TG001 GLUCOSE001 UA001;
run;

 

I also found that this situation may occur when the parameter estimates and standard errors of a variable differ widely across imputations. Unfortunately, I can't figure out why they differ so much between the imputed datasets. Thank you very much for your help!
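Because PROC TRAJ has no BY statement, the "fit each imputation, then stack the estimates" step described above can be sketched with a macro loop. This is only a sketch under assumptions: the macro name traj_by_imputation and the data set names one_imp, oe_i, and oe_all are hypothetical, and the imputed data are assumed to already be in the wide layout that the PROC TRAJ statements later in this thread expect. The PROC TRAJ options are copied from the poster's own call.

```sas
/* Sketch: run PROC TRAJ once per imputation and stack the OUTEST data sets. */
/* Macro and data set names (one_imp, oe_i, oe_all) are hypothetical.        */
%macro traj_by_imputation(nimpute=20);
  %local i;

  /* Start from a clean stacked data set */
  proc datasets lib=work nolist nowarn;
    delete oe_all;
  quit;

  %do i = 1 %to &nimpute;
    /* Keep only the rows of imputation &i */
    data one_imp;
      set imputed;
      where _imputation_ = &i;
    run;

    proc traj data=one_imp out=of outplot=op outstat=os outest=oe;
      id ID;
      var target_variable0-target_variable12;
      indep time0-time12;
      model cnorm;
      max 240;
      ngroups 4;
      order 3 5 4 5;
      risk age sex;
      tcov BMI0-BMI12 pulse0-pulse12 HDLC0-HDLC12 LDLC0-LDLC12
           TC0-TC12 TG0-TG12 glucose0-glucose12 UA0-UA12;
    run;

    /* Tag the estimates with the imputation number so that */
    /* later steps can use BY _imputation_                  */
    data oe_i;
      set oe;
      _imputation_ = &i;
    run;

    proc append base=oe_all data=oe_i force;
    run;
  %end;
%mend traj_by_imputation;

%traj_by_imputation(nimpute=20);
```

One thing that may be worth checking when stacking results this way (a general property of mixture models, not something established in this thread) is whether "group 1" refers to the same trajectory group in every imputation. If the group labels switch between fits, the stacked estimates for a given parameter name would vary widely across imputations.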

 

 

 

SAS_Rob
SAS Employee

Without seeing your code I would say that this is likely due to a large fraction of missing information (FMI).  You should expect an increase in the variance (and thus a reduction in significance), specifically the between imputation variance, when the FMI is high.  This section of the documentation will be helpful in that regard.

SAS Help Center: Multiple Imputation Efficiency
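To make the role of the between-imputation variance concrete, here are the standard combining rules (Rubin's rules) as I understand them from the PROC MIANALYZE documentation. The symbols are introduced here for illustration: $\hat{Q}_i$ is the estimate and $\hat{U}_i$ the squared standard error from imputation $i$, for $i = 1, \dots, m$.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(pooled estimate)} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}
  && \text{(between-imputation variance)} \\
T &= \bar{U}+\Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)} \\
r &= \frac{(1+1/m)\,B}{\bar{U}}
  && \text{(relative increase in variance)} \\
\nu_m &= (m-1)\Bigl(1+\frac{1}{r}\Bigr)^{2}
  && \text{(degrees of freedom)} \\
\lambda &= \frac{r+2/(\nu_m+3)}{r+1}
  && \text{(fraction of missing information)}
\end{aligned}
```

The pooled test compares $\bar{Q}/\sqrt{T}$ with a $t$ distribution on $\nu_m$ degrees of freedom. Each single-imputation test uses only $\hat{U}_i$, whereas the pooled test also carries the $(1+1/m)B$ term, so a parameter can be significant in every imputed dataset and still not be significant after pooling when the estimates vary a lot across imputations. Increasing $m$ only shrinks the $1/m$ factor; it does not make $B$ itself smaller, which would be consistent with changing the number of imputations not helping.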

 

The other possible cause is that you have a bad imputation model (in the Proc MI step) or there is non-convergence in the MI models.

SAS Help Center: Checking Convergence in MCMC
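As a rough sketch of such a convergence check, trace and autocorrelation plots of the posterior means can be requested through the PLOTS= option of the MCMC statement when ODS Graphics is on. The data set and variable names are taken from this thread; which variables to plot is an arbitrary choice here, and the exact option syntax should be confirmed against the PROC MI documentation for your SAS release.

```sas
/* Sketch: trace and autocorrelation plots for the MCMC chain in PROC MI.  */
/* Plotting only BMI and pulse is an arbitrary illustration; list whichever */
/* variables you want to inspect.                                           */
ods graphics on;

proc mi data=trajectory_imputation out=imputed seed=2021 nimpute=20;
  var BMI pulse HDLC LDLC TC TG glucose UA;
  mcmc plots=(trace(mean(BMI pulse)) acf(mean(BMI pulse)));
run;
```

A trace plot with no visible trend and autocorrelations that die out quickly would suggest that the chain has settled; long-lasting trends or slowly decaying autocorrelations would point toward the non-convergence mentioned above.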

 

If you can post your code and LOG (including the MI, modeling and MIANALYZE steps) then there might be something more concrete I can suggest.

FanRu
Calcite | Level 5
Thank you for your help! I also found that the coefficient estimates and standard errors of some covariates differ widely across the imputed datasets. Here are my code and log:
/*MI for longitudinal data*/
proc mi data=trajectory_imputation out=imputed seed=2021 nimpute=20;
var BMI pulse HDLC LDLC TC TG glucose UA;
mcmc timeplot(mean(BMI) mean(pulse) mean(HDLC) mean(LDLC) mean(TC) mean(TG) mean(glucose) mean(UA));
run;
WARNING: The TIMEPLOT option is ignored when ODS Graphics is enabled.
NOTE: The EM algorithm (MLE) converges in 12 iterations.
NOTE: The EM algorithm (posterior mode) converges in 1 iterations.
/*Trajectory modeling*/
proc traj data=imputed out=of outplot=op outstat=os outest=oe;
id ID;
var target_variable0-target_variable12;
indep time0-time12;
model cnorm;
max 240;
ngroups 4;
order 3 5 4 5;
risk age sex;
tcov BMI0-BMI12 pulse0-pulse12 HDLC0-HDLC12 LDLC0-LDLC12 TC0-TC12 TG0-TG12 glucose0-glucose12 UA0-UA12;
run;

data ParameterEstimates;
set oe;/*include PARMS,STDERR and COV*/
if _TYPE_="PARMS" or _TYPE_="STDERR";
run;
/*mianalyze for one trajectory group*/
proc transpose data=ParameterEstimates out=ParameterEstimates(rename=(_NAME_=Parameter PARMS=Estimate STDERR=StdErr));
var INTERC01 LINEAR01 QUADRA01 CUBIC01 BMI001 PULSE001 HDLC001 LDLC001 TC001 TG001 GLUCOSE001 UA001;
by _imputation_;
id _TYPE_;
run;

ODS OUTPUT ParameterEstimates=RESULT;
proc mianalyze parms=ParameterEstimates;
modeleffects INTERC01 LINEAR01 QUADRA01 CUBIC01 BMI001 PULSE001 HDLC001 LDLC001 TC001 TG001 GLUCOSE001 UA001;
run;
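To see how much of the pooled variance comes from differences between imputations, the variance table that PROC MIANALYZE prints can be captured alongside the parameter estimates. This is a minimal sketch that reuses the PARMS= data set and effect names from the code above; VARINFO is a hypothetical output data set name.

```sas
/* Sketch: save the variance information (between, within, and total      */
/* variance, relative increase in variance, fraction of missing            */
/* information) together with the pooled estimates. VARINFO is a           */
/* hypothetical data set name.                                             */
ods output ParameterEstimates=RESULT VarianceInfo=VARINFO;
proc mianalyze parms=ParameterEstimates;
  modeleffects INTERC01 LINEAR01 QUADRA01 CUBIC01 BMI001 PULSE001
               HDLC001 LDLC001 TC001 TG001 GLUCOSE001 UA001;
run;

proc print data=VARINFO noobs;
run;
```

Parameters whose between-imputation variance is large relative to the within-imputation variance are the ones most likely to lose significance after pooling.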













