elg
Fluorite | Level 6

Hello everyone, I'm currently unable to share data, but I'm running into an issue replicating results: two different datasets that should produce identical imputed values do not.


Here are the things I've confirmed to match:

  • The number of observations is the same
  • The values of the observations are the same
  • The same seed is being used in each PROC MI
  • The datasets are being sorted by the same variables prior to running the PROC MI
  • The NIMPUTE= value is the same

I've compared the "Missing Data Patterns" tables and all values match, but the imputed values differ.
Is there anything else I should check or change that might affect the values being imputed?

 

Thanks for your help!!

SAS_Rob
SAS Employee

It isn't clear what you mean when you say that the data sets are different but the observations are the same. Can you elaborate on what exactly is different and what exactly is the same in the two cases? Also, could you post the two sets of code you are using?

elg
Fluorite | Level 6

Hi Rob,


Sorry for being vague. They are independently derived datasets from two different people: one follows ADaM standards and the other does not.


The code isn't elaborate; it looks like this:

proc sort data=ds;
  by trt;
run;

proc mi data=ds seed=123 nimpute=100 out=out;
  by trt;
  var baseline week6 week12 week18 week24;
run;

I did try changing variable names (for example, base vs. baseline) and did not see a difference.

The only differences are in the imputed values, which makes me think something else is being sorted or considered. I also narrowed the datasets down to only the variables mentioned above, and that did not change the result either.

SAS_Rob
SAS Employee

To be clear, then: the data sets are identical in all the variables (trt, baseline, week6, week12, week18, week24), with the observations in the same order and missing values on the same observations?

If you can add the SIMPLE option to the MI statement and post its output along with the Missing Data Patterns output, then that might be helpful.
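
For reference, a minimal sketch of that request, reusing the dataset and variable names from the example posted above (the real names may differ). SIMPLE goes on the PROC MI statement and prints univariate statistics and correlations for the analysis variables:

```sas
/* Same call as in the example above, with SIMPLE added.          */
/* ds and out are the placeholder names used earlier in the thread. */
proc mi data=ds seed=123 nimpute=100 simple out=out;
  by trt;
  var baseline week6 week12 week18 week24;
run;
```

Running this on both datasets and comparing the printed statistics side by side should show whether the inputs differ in anything other than record order.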

elg
Fluorite | Level 6

Hi Rob,

 

I really dug into this and I am able to replicate now. I had tried each of these changes separately without seeing any effect, but the results aligned once I did all of the following:

  • I created datasets that contained only the variables of interest along with a subject ID
  • I first sorted by all variables in the model (trt, baseline, week6, week12, week18, week24) along with a subject ID
  • There were differences in the values at around the 12th decimal place, so I rounded all values to the nearest 0.000001
  • I renamed all variables to be exactly the same

Would all of these have an effect on the MI values? I tried many of these steps previously and did not see a difference, so I guess they all matter together.
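
The steps listed above could be sketched roughly as follows. This is only an illustration: ds_a, ds_a_std, out_a and usubjid are hypothetical names, and the position of the subject ID in the sort key is an assumption, since it is not stated above.

```sas
/* Hypothetical names: ds_a is one of the two derived datasets, */
/* usubjid is the subject ID.                                    */
data ds_a_std;
  /* Keep only the subject ID and the variables used by PROC MI. */
  /* If names differ between datasets, a RENAME= data set option */
  /* here could align them.                                      */
  set ds_a(keep=usubjid trt baseline week6 week12 week18 week24);
  array v{*} baseline week6 week12 week18 week24;
  do i = 1 to dim(v);
    /* Round away differences near the 12th decimal place.       */
    /* Missing values stay missing.                              */
    v{i} = round(v{i}, 0.000001);
  end;
  drop i;
run;

/* Fix the record order; trt comes first so BY trt works below.  */
proc sort data=ds_a_std;
  by trt baseline week6 week12 week18 week24 usubjid;
run;

proc mi data=ds_a_std seed=123 nimpute=100 out=out_a;
  by trt;
  var baseline week6 week12 week18 week24;
run;
```

The same steps would be repeated for the second dataset, so that both inputs reach PROC MI with the same variables, the same rounded values, and the same record order.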

 

 

Ksharp
Super User

It looks like the order of the records differs between these two datasets.

Here is an example.

 

/* Simulate 100 records with missing values in each variable */
data ds;
call streaminit(123);
do n=1 to 100;
 baseline=rand('normal');
 month1=rand('normal');
 month2=rand('normal');
 if n in (10:20 90 98:100) then call missing(baseline);
 if n in (1:4 9 28:30) then call missing(month1);
 if n in (40:50 80:85) then call missing(month2);
 output;
end;
run;

/* Same records, reversed order */
proc sort data=ds out=ds2;
by descending n;
run;

/* Same seed and options, different record order */
proc mi data=ds seed=123 nimpute=100 out=out1;
var baseline month1 month2;
run;
proc mi data=ds2 seed=123 nimpute=100 out=out2;
var baseline month1 month2;
run;

/* Put out2 back into the same order as out1 for comparison */
proc sort data=out2 out=out22;
by _Imputation_ n;
run;
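
One way to check the two results against each other is PROC COMPARE, matching records on _Imputation_ and n. This is a sketch added for illustration and is not part of the code above; it assumes out1 is already ordered by _Imputation_ and n, which should hold because ds was created in ascending order of n.

```sas
/* Compare the imputed values record by record.                    */
/* out1 and out22 come from the steps above; the ID statement      */
/* matches observations on imputation number and record number.    */
proc compare base=out1 compare=out22;
  id _Imputation_ n;
  var baseline month1 month2;
run;
```

If record order affects the imputations, the report should list unequal values for the observations that were originally missing, while the observed values should still agree.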

 

 

 


 
