km0927
Obsidian | Level 7

Hi, I'm using PROC MI for multiple imputation.

I saved my imputed results using seeds (i.e., for the random number generator).

However, yesterday I found that PROC MI produced different results with the same seed. My results were all gone!

I'm panicking, but I'm facing the truth and trying to find out what the problem was.

My code was like the one below.


proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

 

I didn't use seed number in other proc mi function.

 

I think that maybe the seed number was not long enough, or there was some other reason.

I would really appreciate it if you could tell me why my seeded results all changed.

1 ACCEPTED SOLUTION

Reeza
Super User

Without your data, I ran a test and could not generate results that differ.

If you sorted the data differently or changed something earlier in the program, that could be causing the change rather than PROC MI. Or the cause could be something else you haven't shared with us. You can try running the code below, including the PROC COMPARE step, to see whether there is any difference. If your version of SAS does produce a difference, it's worth chatting with Tech Support.

 

*---------------------Data on Physical Fitness-------------------------*
| These measurements were made on men involved in a physical fitness   |
| course at N.C. State University. Certain values have been set to     |
| missing and the resulting data set has an arbitrary missing pattern. |
| Only selected variables of                                           |
| Oxygen (intake rate, ml per kg body weight per minute),              |
| Runtime (time to run 1.5 miles in minutes),                          |
| RunPulse (heart rate while running) are used.                        |
*----------------------------------------------------------------------*;
data Fitness1;
   input Oxygen RunTime RunPulse @@;
   datalines;
44.609  11.37  178     45.313  10.07  185
54.297   8.65  156     59.571    .      .
49.874   9.22    .     44.811  11.63  176
  .     11.95  176          .  10.85    .
39.442  13.08  174     60.055   8.63  170
50.541    .      .     37.388  14.03  186
44.754  11.12  176     47.273    .      .
51.855  10.33  166     49.156   8.95  180
40.836  10.95  168     46.672  10.00    .
46.774  10.25    .     50.388  10.08  168
39.407  12.63  174     46.080  11.17  156
45.441   9.63  164       .      8.92    .
45.118  11.08    .     39.203  12.88  168
45.790  10.47  186     50.545   9.93  148
48.673   9.40  186     47.920  11.50  170
47.467  10.50  170
;


proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex1;
   var Oxygen RunTime;
run;

proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex2;
   var Oxygen RunTime;
run;


proc compare data=outex1 compare=outex2;
run;

 

 


Rick_SAS
SAS Super FREQ

> I didn't use seed number in other proc mi function.

 

Can you explain what you mean by this statement? Do you mean you didn't use the SEED= option? Or that you used a different seed value? 

 

Is it possible that you specified an invalid seed? To be reproducible, the seed value has to be a positive integer. The valid range (for the default MTHYBRID algorithm) is [1, 4294967295].
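
A minimal sketch of why an explicit seed matters, assuming (as I understand the PROC MI documentation) that the procedure derives its seed from the system clock when SEED= is omitted. The data set Demo and the output names noseed1 and noseed2 are made up for illustration.

```sas
/* Hypothetical test data: y is set to missing for roughly 20% of rows */
data Demo;
   call streaminit(2024);
   do id = 1 to 50;
      x = rand("Normal", 50, 10);
      y = x + rand("Normal", 0, 5);
      if rand("Bernoulli", 0.2) then y = .;
      output;
   end;
run;

/* No SEED= option: each run picks its own starting seed */
proc mi data=Demo nimpute=2 out=noseed1 noprint;
   var x y;
run;

proc mi data=Demo nimpute=2 out=noseed2 noprint;
   var x y;
run;

/* The imputed values of y will most likely differ between the two runs */
proc compare base=noseed1 compare=noseed2 brief;
run;
```

Adding the same positive SEED= value to both PROC MI steps should make the comparison come back clean, as in Reeza's test.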

km0927
Obsidian | Level 7

It means that I didn't use the same seed number in my other PROC MI code.

For example, if I used seed 1 in the MI of VAS, like

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;

then I didn't use seed number 1 in my other PROC MI code, unlike below.

proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var NRS_2 NRS_4 NRS_6 NRS_8 NRS_10 NRS_12;
run;

Thanks.

Rick_SAS
SAS Super FREQ

Thank you for your answer. So that we can help you, please post the code that you used and explain what this sentence means:

"I found that proc mi produced the different results with same seeds. My results were all gone!"

Reeza
Super User

@km0927, if you did not use the same seed number in your other PROC MI code, you cannot expect your results to be identical. You need to use the same seed to get identical results.

If you don't recall the seed or accidentally changed it, you're out of luck unless you saved the log, or unless you want to try a whole bunch of seeds until you find the exact one that was used.
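
One way to avoid that situation is to save the seed alongside the imputed data. The sketch below is only an illustration: it assumes the ODS table that PROC MI calls "Model Information" is named ModelInfo and lists the seed, which is how I recall the documentation. The data set Demo and the names mi_info and imputed are hypothetical.

```sas
/* Hypothetical test data: y is set to missing for roughly 20% of rows */
data Demo;
   call streaminit(2024);
   do id = 1 to 50;
      x = rand("Normal", 50, 10);
      y = x + rand("Normal", 0, 5);
      if rand("Bernoulli", 0.2) then y = .;
      output;
   end;
run;

/* Capture the Model Information table (assumed ODS name: ModelInfo) */
ods output ModelInfo=mi_info;

proc mi data=Demo nimpute=2 seed=1 out=imputed;
   var x y;
run;

/* Inspect the saved settings; store mi_info in a permanent library
   next to the imputed data so the seed can be looked up later */
proc print data=mi_info noobs;
run;
```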

Rick_SAS
SAS Super FREQ

@Reeza , note that the OP's variables are different in each PROC call. Thus the programs can never generate "the same results."

 

@km0927 Your examples use different variables (but the same seed), so, yes, you will get different results. The results depend on the variables, their values, and even their order. Compare these two outputs, which are different because the order of variables is different:

data A;
call streaminit(12345);
do i = 1 to 100;
   x1 = rand("Uniform", 0, 100); if rand("Bern", 0.1) then x1=.;
   x2 = rand("Uniform", 0, 100); if rand("Bern", 0.2) then x2=.;
   y1 = rand("Uniform", 0, 100); if rand("Bern", 0.3) then y1=.;
   y2 = rand("Uniform", 0, 100); if rand("Bern", 0.4) then y2=.;
output;
end;
run;

proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_1;
var x1 x2 y1 y2;
run;

proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_2;
var y1 y2 x1 x2;
run;

proc compare base=data_1 compare=data_2 brief novalues;
run;
km0927
Obsidian | Level 7

Thank you for all your answers.

I'm sorry for the misleading wording. I thought that a seed is similar to a game save file. In a game, if we save over a previously saved file, we cannot restore the previous play. So I thought that if I used the same seed number in another PROC MI call with different variables or conditions (in my example, VAS and NRS), I could no longer regenerate the data previously generated with that seed number. But thanks to Reeza, I used PROC COMPARE and found that this isn't true.

I'm not sure whether I changed the seed number or used a different variable order. The results generated with a seed number don't get deleted after closing SAS, right?

Rick_SAS
SAS Super FREQ

Correct: The SEED= option has nothing to do with saving or deleting or overwriting files.

 

The SEED= option initializes a random number stream. A seed value specifies a particular stream from a set of possible random number streams. When you specify a seed, SAS generates the same set of pseudorandom numbers every time you run the program. See the article

Random number streams in SAS: How do they work?
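
As a small illustration of that behavior outside PROC MI, here is a sketch that uses CALL STREAMINIT and the RAND function in two separate DATA steps. The data set names run1 and run2 are made up for the example.

```sas
/* First DATA step: initialize the stream with seed 1 and draw 5 values */
data run1;
   call streaminit(1);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* Second DATA step: same seed, so the same stream is used again */
data run2;
   call streaminit(1);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* Same seed, same stream: no differences are expected */
proc compare base=run1 compare=run2 brief;
run;
```

Changing the seed in one of the DATA steps selects a different stream, so the comparison should then report differences in u. Nothing is stored or overwritten by the seed itself.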

 

 

km0927
Obsidian | Level 7

Thanks. Though I still haven't found out why this situation occurred, this topic and all of your answers helped me a lot.
