Solved: Proc mi produced different results with same seeds

km0927 · Posted 04-15-2019 08:48 PM

Hi, I'm using PROC MI for multiple imputation.

I've saved my imputed results with seeds (i.e., for the random number generator).

However, yesterday I found that PROC MI produced different results with the same seed. My results were all gone!

I'm panicking, but I'm facing the truth and trying to find out what the problem was.

My code was like the one below.

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

I didn't use this seed number in any other PROC MI call.

I think that maybe the seed number was not long enough, or there is some other reason.

I would really appreciate it if you could tell me why my seeded results changed.

Reeza · Posted 04-15-2019 09:20 PM

Without your data, I ran a test and could not generate results that differ.

If you have sorted the data differently or changed something earlier in the program, that could be causing the change rather than PROC MI. Or the cause could be something you haven't shared with us. You can run the code below, including the PROC COMPARE step, to check for differences. If your version of SAS produces a difference, it's worth contacting tech support.

*---------------------Data on Physical Fitness-------------------------*
| These measurements were made on men involved in a physical fitness   |
| course at N.C. State University. Certain values have been set to     |
| missing and the resulting data set has an arbitrary missing pattern. |
| Only selected variables of                                           |
| Oxygen (intake rate, ml per kg body weight per minute),              |
| Runtime (time to run 1.5 miles in minutes),                          |
| RunPulse (heart rate while running) are used.                        |
*----------------------------------------------------------------------*;
data Fitness1;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.609  11.37  178     45.313  10.07  185
54.297   8.65  156     59.571    .      .
49.874   9.22    .     44.811  11.63  176
  .     11.95  176          .  10.85    .
39.442  13.08  174     60.055   8.63  170
50.541    .      .     37.388  14.03  186
44.754  11.12  176     47.273    .      .
51.855  10.33  166     49.156   8.95  180
40.836  10.95  168     46.672  10.00    .
46.774  10.25    .     50.388  10.08  168
39.407  12.63  174     46.080  11.17  156
45.441   9.63  164       .      8.92    .
45.118  11.08    .     39.203  12.88  168
45.790  10.47  186     50.545   9.93  148
48.673   9.40  186     47.920  11.50  170
47.467  10.50  170
;


proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex1;
   var Oxygen RunTime;
run;

proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex2;
   var Oxygen RunTime;
run;


proc compare data=outex1 compare=outex2;
run;
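
To turn the comparison above into a one-line verdict in the log, one option is the SYSINFO automatic macro variable, which PROC COMPARE sets to a return code (0 when no differences are found). This is only a sketch: the macro variable cmp_rc and the macro name report_compare are made up for illustration, and it assumes outex1 and outex2 were created by the two PROC MI steps above.

```sas
/* Re-run the comparison without printed output. */
proc compare base=outex1 compare=outex2 noprint;
run;

/* SYSINFO must be captured right after PROC COMPARE,  */
/* because later steps can reset it.                   */
%let cmp_rc = &sysinfo;

/* Hypothetical helper: report the result in the log. */
%macro report_compare;
   %if &cmp_rc = 0 %then %put NOTE: The two imputed data sets are identical.;
   %else %put WARNING: The data sets differ (PROC COMPARE return code &cmp_rc).;
%mend report_compare;
%report_compare
```

If the two runs really do use the same data, options, and seed, the NOTE branch is the one that should fire.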


Rick_SAS · Posted 04-16-2019 07:59 AM

> I didn't use seed number in other proc mi function.

Can you explain what you mean by this statement? Do you mean you didn't use the SEED= option? Or that you used a different seed value?

Is it possible that you specified an invalid seed? To be reproducible, the seed value has to be a positive integer. The valid range (for the default MTHYBRID algorithm) is [1, 4294967295].

km0927 · Posted 04-17-2019 04:26 AM

I mean that I didn't use the same seed number in my other PROC MI code.

For example, if I used seed 1 for the MI of VAS, like

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

then I didn't use seed 1 in any other PROC MI code, unlike below.

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var NRS_2 NRS_4 NRS_6 NRS_8 NRS_10 NRS_12;
run;

Thanks.

Rick_SAS · Posted 04-17-2019 07:41 AM

Thank you for your answer. So that we can help you, please post the code that you used and explain what this sentence means:

"I found that proc mi produced the different results with same seeds. My results were all gone!"

Reeza · Posted 04-17-2019 11:52 AM

@km0927 wrote:

I mean that I didn't use the same seed number in my other PROC MI code.

For example, if I used seed 1 for the MI of VAS, like

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

then I didn't use seed 1 in any other PROC MI code, unlike below.

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var NRS_2 NRS_4 NRS_6 NRS_8 NRS_10 NRS_12;
run;

Thanks.

In that case you cannot expect your results to be identical. You need to use the same seed to get identical results.

If you don't recall the seed or accidentally changed it, you're out of luck unless you have the log saved, or unless you want to try a whole bunch of seeds to find the exact one used.
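
One way to avoid losing track of a seed is to keep it in a single macro variable, write it to the log, and stamp it on the output data set. The sketch below is an illustration, not the OP's actual program: the macro variable name mi_seed and the value 20190415 are arbitrary, while mywork.data and the VAS variables are taken from the original post.

```sas
/* Keep the seed in one place (name and value are arbitrary examples). */
%let mi_seed = 20190415;

/* Record the seed in the log so a saved log documents the run. */
%put NOTE: PROC MI for the VAS variables uses seed=&mi_seed;

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100
        seed=&mi_seed out=mywork.data_1;
   var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

/* Stamp the seed into the data set label so it travels with the data. */
proc datasets library=mywork nolist;
   modify data_1 (label="PROC MI imputations, seed=&mi_seed");
quit;
```

With the seed in the data set label, PROC CONTENTS on mywork.data_1 shows which seed produced it even if the log is gone.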

Rick_SAS · Posted 04-17-2019 02:24 PM

@Reeza, note that the OP's variables are different in each PROC call. Thus the programs can never generate "the same results."

@km0927 Your examples use different variables (but the same seed), so, yes, you will get different results. The results depend on the variables, their values, and even their order. Compare these two outputs, which are different because the order of variables is different:

data A;
call streaminit(12345);
do i = 1 to 100;
   x1 = rand("Uniform", 0, 100); if rand("Bern", 0.1) then x1=.;
   x2 = rand("Uniform", 0, 100); if rand("Bern", 0.2) then x2=.;
   y1 = rand("Uniform", 0, 100); if rand("Bern", 0.3) then y1=.;
   y2 = rand("Uniform", 0, 100); if rand("Bern", 0.4) then y2=.;
output;
end;
run;

proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_1;
var x1 x2 y1 y2;
run;

proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_2;
var y1 y2 x1 x2;
run;

proc compare base=data_1 compare=data_2 brief novalues;
run;
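
Reeza's earlier point about sort order can be checked in the same way. The following sketch reuses data set A and output data_1 from the example above and only changes the order of the rows; the names A_sorted, data_3, d1, and d3 are made up for illustration. The expectation that the imputed values differ is an assumption to verify with the comparison, not a documented guarantee.

```sas
/* Same rows as A, but in reverse order. */
proc sort data=A out=A_sorted;
   by descending i;
run;

/* Same variables, same options, same seed as the data_1 run. */
proc mi data=A_sorted nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_3;
   var x1 x2 y1 y2;
run;

/* Put both outputs in the same order so rows are matched by i. */
proc sort data=data_1 out=d1; by _Imputation_ i; run;
proc sort data=data_3 out=d3; by _Imputation_ i; run;

proc compare base=d1 compare=d3 brief novalues;
   id _Imputation_ i;
run;
```

If PROC COMPARE reports unequal values here, a changed sort order alone would be enough to explain "different results with the same seed."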

km0927 · Posted 04-21-2019 02:55 AM

Thank you for all your answers.

I'm sorry for the misleading wording. I thought that a seed was similar to a saved game file. In a game, if we save over a previously saved file, we cannot restore the previous play. So I thought that if I used the same seed number in another PROC MI call with different variables or conditions (in my example, VAS and NRS), I could no longer regenerate the data previously generated with that seed number. But thanks to Reeza, I used PROC COMPARE and found that this isn't true.

I'm not sure whether I changed the seed number or used a different variable order. The results generated with a seed number don't get deleted after closing SAS, right?

Rick_SAS · Posted 04-21-2019 07:06 AM

Correct: The SEED= option has nothing to do with saving or deleting or overwriting files.

The SEED= option initializes a random number stream. A seed value specifies a particular stream from a set of possible random number streams. When you specify a seed, SAS generates the same set of pseudorandom numbers every time you run the program. See the article "Random number streams in SAS: How do they work?"
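
A small DATA step sketch of this behavior, using CALL STREAMINIT and the RAND function (the data set names run1, run2, and run3 are arbitrary): two steps with the same seed should produce the same stream, and a step with a different seed should not.

```sas
/* Seed 1, first run. */
data run1;
   call streaminit(1);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* Seed 1 again: same stream, so the same values of u. */
data run2;
   call streaminit(1);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* Seed 2: a different stream. */
data run3;
   call streaminit(2);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

proc compare base=run1 compare=run2 brief; run;   /* same seed      */
proc compare base=run1 compare=run3 brief; run;   /* different seed */
```

Nothing is stored or overwritten by the seed itself; rerunning run1 at any later time regenerates the same numbers.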

km0927 · Posted 04-26-2019 02:02 AM

Thanks. Though I still haven't found out why this happened, this topic and all of your answers helped me a lot.
