Hi, I'm using PROC MI for multiple imputation.
I saved my imputed results using seeds (i.e., for the random number generator).
However, yesterday I found that PROC MI produced different results with the same seed. My results were all gone!
I'm panicking, but I'm trying to face it and find out what the problem was.
My code was as follows:
proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;
I didn't use this seed number in my other PROC MI steps.
I think that maybe the seed number was not long enough, or there is some other reason.
I would really appreciate it if you could tell me why my seeded results all changed.
Without your data, I ran a test and could not generate results that differ.
If you sorted the data differently or changed something earlier in the program, that could be causing the change rather than PROC MI. Or something else you haven't shared with us could be the cause. You can run the code below, including the PROC COMPARE step, to check for differences. If your version of SAS produces a difference, it's worth contacting tech support.
*---------------------Data on Physical Fitness-------------------------*
| These measurements were made on men involved in a physical fitness |
| course at N.C. State University. Certain values have been set to |
| missing and the resulting data set has an arbitrary missing pattern. |
| Only selected variables of |
| Oxygen (intake rate, ml per kg body weight per minute), |
| Runtime (time to run 1.5 miles in minutes), |
| RunPulse (heart rate while running) are used. |
*----------------------------------------------------------------------*;
data Fitness1;
input Oxygen RunTime RunPulse @@;
datalines;
44.609 11.37 178 45.313 10.07 185
54.297 8.65 156 59.571 . .
49.874 9.22 . 44.811 11.63 176
. 11.95 176 . 10.85 .
39.442 13.08 174 60.055 8.63 170
50.541 . . 37.388 14.03 186
44.754 11.12 176 47.273 . .
51.855 10.33 166 49.156 8.95 180
40.836 10.95 168 46.672 10.00 .
46.774 10.25 . 50.388 10.08 168
39.407 12.63 174 46.080 11.17 156
45.441 9.63 164 . 8.92 .
45.118 11.08 . 39.203 12.88 168
45.790 10.47 186 50.545 9.93 148
48.673 9.40 186 47.920 11.50 170
47.467 10.50 170
;
proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex1;
var Oxygen RunTime;
run;
proc mi data=Fitness1 seed=1 round=1 minimum=0 maximum=100 nimpute=10 out=outex2;
var Oxygen RunTime;
run;
proc compare data=outex1 compare=outex2;
run;
> I didn't use seed number in other proc mi function.
Can you explain what you mean by this statement? Do you mean you didn't use the SEED= option? Or that you used a different seed value?
Is it possible that you specified an invalid seed? To be reproducible, the seed value has to be a positive integer. The valid range (for the default MTHYBRID algorithm) is [1, 4294967295].
It means that I didn't use the same seed number in my other PROC MI code.
For example, if I used seed 1 in the MI of VAS, like this:
proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
run;
then I didn't use seed number 1 in any other PROC MI code. That is, I did not do the following:
proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=mywork.data_1;
var NRS_2 NRS_4 NRS_6 NRS_8 NRS_10 NRS_12;
run;
Thanks.
Thank you for your answer. So that we can help you, please post the code that you used and explain what this sentence means:
"I found that proc mi produced the different results with same seeds. My results were all gone!"
In that case you cannot expect your results to be identical. You need to use the same seed to get identical results.
If you don't recall the seed, or accidentally changed it, then unless you saved the log you're out of luck, short of trying a whole bunch of seeds to find the exact one used.
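The brute-force search mentioned above could be sketched roughly as follows. This is only an illustration under several assumptions: that the saved imputed data set (here mywork.data_1, from the original post) still exists, that the input data, its sort order, and the VAR list are unchanged, and that the lost seed is a small integer. The macro name find_seed, its max_seed parameter, and the scratch data set work.try are hypothetical names invented for this sketch.

```sas
/* Hypothetical helper: rerun PROC MI with seeds 1..max_seed and  */
/* stop at the first seed whose output matches the saved data.    */
%macro find_seed(max_seed=20);
   %local s;
   %do s = 1 %to &max_seed;
      /* Same options as the original call; only the seed varies */
      proc mi data=mywork.data nimpute=5 round=1 minimum=0 maximum=100
              seed=&s out=work.try noprint;
         var VAS_2 VAS_4 VAS_6 VAS_8 VAS_10 VAS_12;
      run;

      /* PROC COMPARE sets &SYSINFO; 0 means no differences found */
      proc compare base=mywork.data_1 compare=work.try noprint;
      run;

      /* Read &SYSINFO right away, before another step resets it */
      %if &sysinfo = 0 %then %do;
         %put NOTE: seed=&s reproduces mywork.data_1.;
         %return;
      %end;
   %end;
   %put NOTE: No seed in 1..&max_seed reproduced mywork.data_1.;
%mend find_seed;

%find_seed(max_seed=20)
```

If nothing matches, the cause may well lie elsewhere (changed data, sort order, or variable order), so a failed search does not by itself prove the seed was outside the range tried.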
@Reeza, note that the OP's variables are different in each PROC call. Thus the programs can never generate "the same results."
@km0927 Your examples use different variables (but the same seed), so, yes, you will get different results. The results depend on the variables, their values, and even their order. Compare these two outputs, which are different because the order of variables is different:
data A;
call streaminit(12345);
do i = 1 to 100;
x1 = rand("Uniform", 0, 100); if rand("Bern", 0.1) then x1=.;
x2 = rand("Uniform", 0, 100); if rand("Bern", 0.2) then x2=.;
y1 = rand("Uniform", 0, 100); if rand("Bern", 0.3) then y1=.;
y2 = rand("Uniform", 0, 100); if rand("Bern", 0.4) then y2=.;
output;
end;
run;
proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_1;
var x1 x2 y1 y2;
run;
proc mi data=A nimpute=5 round=1 minimum=0 maximum=100 seed=1 out=data_2;
var y1 y2 x1 x2;
run;
proc compare base=data_1 compare=data_2 brief novalues;
run;
Thank you for all your answers.
I'm sorry for the misleading wording. I thought that a seed was similar to a game save file. In a game, if we save over a previously saved file, we cannot recover the previous play. So I thought that if I used the same seed number in another PROC MI step with different variables or conditions (in my example, VAS and NRS), I could no longer regenerate the data previously generated with that seed number. But thanks to Reeza, I used PROC COMPARE and found that this isn't true.
I'm not sure whether I changed the seed number or used a different variable order. The results generated with a seed number don't get deleted after closing SAS, right?
Correct: The SEED= option has nothing to do with saving or deleting or overwriting files.
The SEED= option initializes a random number stream. A seed value specifies a particular stream from a set of possible random number streams. When you specify a seed, SAS generates the same set of pseudorandom numbers every time you run the program. See the article "Random number streams in SAS: How do they work?"
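As a small illustration of that point, here is a sketch that uses the DATA step functions STREAMINIT and RAND rather than PROC MI itself, so it shows the general behavior of seeds and not necessarily the generator PROC MI uses internally. The data set names stream1, stream2, and stream3 are made up for this example.

```sas
/* Two streams started from the same seed */
data stream1;
   call streaminit(1);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

data stream2;
   call streaminit(1);      /* same seed as stream1 */
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* A third stream started from a different seed */
data stream3;
   call streaminit(2);
   do i = 1 to 5;
      u = rand("Uniform");
      output;
   end;
run;

/* Same seed: the values of u should match */
proc compare base=stream1 compare=stream2 brief;
run;

/* Different seed: the values of u should differ */
proc compare base=stream1 compare=stream3 brief;
run;
```

Nothing is stored or overwritten by the seed; rerunning either DATA step simply regenerates the same numbers from scratch.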
Thanks. Though I still haven't found why this situation occurred, this topic and all of your answers helped me a lot.