Hi All,
I have a query, can someone help me?
When creating a randomization schedule with SAS PROC PLAN and letting SAS use a dynamic seed value, is there any way to find out which seed value SAS used to create the schedule?
Part of SAS code:
Data _null_;
call symputx ('Seedval', tranwrd(put(ranuni(0), best9.), '0.' , ' '));
run;
proc plan seed=&Seedval. ;
factors BlockID= 50 ordered Treatment=4 random /noprint ;
output out=RandList BlockID nvals=(1 to 50)
Treatment cvals=('Arm A' 'Arm A' 'Arm B' 'Arm B' );
quit;
In my code I forgot to output the value of the macro variable &Seedval.
Thanks in advance,
Sukumar
Hello, thank you for your quick reply. I know the time at which I executed the code. Is there any chance of recovering the seed from the system time?
-Regards,
Sukumar
Hello @SukumarBalusamy and welcome to the SAS Support Communities!
PROC PLAN writes the initial seed (and the final seed) to the log. In addition, you can retrieve these two seeds from automatic macro variables SYSRANDOM and SYSRANEND, respectively, as shown in the log of your code (plus two %PUT statements) below:
568  Data _null_;
569  call symputx ('Seedval', tranwrd(put(ranuni(0), best9.), '0.' , ' '));
570  run;
NOTE: DATA statement used (Total process time):
      real time           0.05 seconds
      cpu time            0.06 seconds
571
572  proc plan seed=&Seedval. ;
NOTE: At the start of processing, random number seed=9261594.
573  factors BlockID= 50 ordered Treatment=4 random /noprint ;
574  output out=RandList BlockID nvals=(1 to 50)
575  Treatment cvals=('Arm A' 'Arm A' 'Arm B' 'Arm B' );
576  quit;
NOTE: The data set WORK.RANDLIST has 200 observations and 2 variables.
NOTE: At the end of processing, random number seed=1219503538.
NOTE: PROCEDURE PLAN used (Total process time):
      real time           0.10 seconds
      cpu time            0.10 seconds
577
578  %put &=sysrandom;
SYSRANDOM=9261594
579  %put &=sysranend;
SYSRANEND=1219503538
Thank you for the information about %put &=sysrandom; and %put &=sysranend;. I will definitely use these features in my future code. In my present situation, I have already created a randomization schedule, but I don't know the seed value that SAS used. I know the approximate time at which I executed the code. Is there any way to recover the seed?
This is an interesting, but difficult task. I don't know how exactly SAS determines the seed from "the time of day" (RANUNI documentation), but I've just figured out how you can recover the actual seed from a ranuni(0) random number.
SAS Log:
1065  data _null_; t=time(); x=ranuni(0); nextseed=x*(2**31-1); put (t x nextseed)(=best16.); run;
t=71854.3729999065 x=0.03466819368055 nextseed=74449379
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.06 seconds
1066  data _null_; t=time(); x=ranuni(0); nextseed=x*(2**31-1); put (t x nextseed)(=best16.); run;
t=71854.4820001125 x=0.4605221792406 nextseed=988963849
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds
1067  data _null_; x=ranuni( 195590984); put x=best16.; run;
x=0.03466819368055
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.06 seconds
1068  data _null_; x=ranuni(1764153903); put x=best16.; run;
x=0.4605221792406
The mathematical formula for the seed appears to be: seed = mod(58743242*nextseed, 2**31-1). The "magic" number 58743242 is the multiplicative inverse of the "multiplier" 397204094 (mentioned in the RANUNI documentation) modulo 2**31-1.
For the first of the two examples above you can use this formula directly in a data step:
1080  data _null_;
1081  nextseed=74449379;
1082  seed = mod(58743242*nextseed, 2**31-1);
1083  put seed;
1084  run;
195590984
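For readers without a SAS session at hand, the relationship can be checked with exact integer arithmetic. The following is only a sketch in Python, not SAS. It assumes, as the log output above suggests, that one RANUNI step turns the internal seed into mod(397204094*seed, 2**31-1) and returns that value divided by 2**31-1. The helper names next_seed and previous_seed are made up for illustration.

```python
# Sketch only: assumes one RANUNI step is a multiplicative congruential step
# with multiplier C and modulus D, as the log output above suggests.
C = 397204094          # "multiplier" from the RANUNI documentation
D = 2**31 - 1          # modulus, 2147483647 (a prime)

# Multiplicative inverse of C modulo D (Python 3.8+ accepts exponent -1 here)
C_INV = pow(C, -1, D)


def next_seed(seed):
    """One forward step: internal seed after a RANUNI call (hypothetical helper)."""
    return (C * seed) % D


def previous_seed(nextseed):
    """Undo one step: the seed that produced `nextseed` (hypothetical helper)."""
    return (C_INV * nextseed) % D


if __name__ == "__main__":
    print(C_INV)                      # 58743242
    seed = previous_seed(74449379)    # nextseed from the first log example
    print(seed)                       # 195590984
    print(next_seed(seed) / D)        # approximately 0.03466819368055
```

Because Python integers do not overflow, the same two helper functions also work for nextseed values whose product with 58743242 is too large for a SAS numeric variable.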
For the second example you need more sophisticated code (or just use the Windows calculator calc.exe) to obtain the seed 1764153903, because
58743242*988963849 = 58094942711058458 > constant('exactint')
on a Windows system so that the precision is insufficient.
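The precision problem can be reproduced outside SAS as well. The sketch below uses Python rather than SAS and rests on one assumption: Python floats are IEEE double-precision numbers, comparable to SAS numeric variables on Windows, while Python integers are exact. The variable names are illustrative only.

```python
import math

C_INV = 58743242       # inverse multiplier from the formula above
D = 2**31 - 1          # modulus
nextseed = 988963849   # second example from the log

# Exact integer arithmetic: no rounding, whatever the size of the product
exact_product = C_INV * nextseed
exact_seed = exact_product % D

# Double-precision arithmetic, comparable to a naive DATA step calculation
float_product = float(C_INV) * float(nextseed)
float_seed = int(math.fmod(float_product, float(D)))

print(exact_product > 2**53)       # True: beyond 2**53 doubles cannot hold every integer
print(exact_seed)                  # 1764153903
print(float_seed == exact_seed)    # False: the rounded product yields a wrong seed
```

The exact result agrees with the seed 1764153903 used in the RANUNI call of the log above, whereas the double-precision result is off because the product had to be rounded.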
Now the difficult question is how to obtain the seed value from the time() value (or is datetime() used?), e.g.:
- 195590984 from 71854.3729999065 and
- 1764153903 from 71854.4820001125.
Obviously, you must not ignore fractions of a second when you "loop through the times" ...
@FreelanceReinh wrote:
Now the difficult question is how to obtain the seed value from the time() value (or is datetime() used?), e.g.:
- 195590984 from 71854.3729999065 and
- 1764153903 from 71854.4820001125.
Obviously, you must not ignore fractions of a second when you "loop through the times" ...
Addendum:
It has turned out that it's even more complicated than that. In a macro loop I created (time(), ranuni(0)) pairs and calculated the corresponding (internal) seed value. Several times successive time() values were exactly identical, yet the internal seed values (and hence the random numbers) were totally different. The first ten observations are shown below.
Obs       time()            ranuni(0)           seed
  1   36776.9690001011   0.35282075235286   1799776034
  2   36776.9839999676   0.87105420598344     36942890
  3   36776.9839999676   0.48406173311363   1140815067
  4   36777.0000000000   0.81978151706037     94069556
  5   36777.0000000000   0.51275829529053   1348101720
  6   36777.0150001049   0.78849600851000   1812341011
  7   36777.0150001049   0.76414637489437    910195803
  8   36777.0309998989   0.98833351442047   1748722162
  9   36777.0309998989   0.44593092074894   2132102851
 10   36777.0469999313   0.18930490696305   2064843663
So, the internal seed is definitely not derived from time() or datetime() alone.
@SukumarBalusamy wrote:
Hello, yes, you are correct. I executed the code a couple of months ago [08-Feb-2022 14:43] and forgot to include code to capture the seed value. I need to document it, and the question is how to reproduce the list.
Hello @SukumarBalusamy,
Sadly, I think from the results shown so far it's fairly obvious that the approximate time stamp "08-Feb-2022 14:43" is not going to give us a useful clue about the unknown seed value.
But there is hope! Your DATA _NULL_ step can only generate up to 10 million different seeds, which is a relatively small subset (<0.5%) of the set of all 2.1 billion possible seeds. Preliminary investigations that I have done indicate that there is a chance to reproduce the results from PROC PLAN in a DATA step. Moreover, far fewer than your 50 randomized blocks should be sufficient to characterize a seed uniquely: each block of two 'Arm A' and two 'Arm B' assignments can occur in 6 distinguishable orders, so with only 9 blocks we already have 6**9=10,077,696 different possible treatment combinations, i.e. more combinations than seeds. This means that the known result of, e.g., the first 9 (or possibly 7 or 8) blocks in dataset RandList is likely to reduce the number of "candidate" seeds from 10 million to such a small value that it's feasible to test all of these with the complete PROC PLAN step.
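The counting argument can be made concrete with a few lines of arithmetic. The sketch below is in Python with names chosen for illustration. It assumes that Seedval has at most seven digits (which is what the BEST9. format applied to a number of the form 0.xxxxxxx implies) and that a block containing A, A, B, B can appear in 4!/(2!2!) = 6 distinguishable orders.

```python
from math import comb

n_seedval = 10**7                  # 7-digit values 0000000..9999999 (assumption, see above)
n_all_seeds = 2**31 - 1            # approximate size of the full seed range
share = n_seedval / n_all_seeds    # fraction of seeds the DATA _NULL_ step can produce

# Choose the 2 positions of 'Arm A' among the 4 positions of a block
arrangements_per_block = comb(4, 2)

# Smallest number of blocks whose arrangement count reaches the number of seeds
min_blocks = 1
while arrangements_per_block**min_blocks < n_seedval:
    min_blocks += 1

print(f"{share:.4%}")                 # share of all seeds, below 0.5%
print(arrangements_per_block**9)      # 10077696
print(min_blocks)                     # 9
```

With 8 blocks there are only 6**8 = 1,679,616 arrangements, so several seeds would typically share the same first 8 blocks; from 9 blocks onward a unique match becomes plausible.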
So, this might be a promising strategy:
Obviously, item 1 is not an easy task ... Alternatively, you could go for a brute-force approach and replace item 1 with a macro loop running 10 million PROC PLAN steps (creating only 7 to 9 blocks each, to save time and disk space). However, I don't know how long this would take to run on your hardware, even if 5 million iterations might be enough to get a hit (if you're lucky). Maybe try with, say, 10,000 iterations and then extrapolate the time.
Basically, the question is how important it really is for you to find the unknown seed value.
Hello @SukumarBalusamy,
Good news! You will be able to recover your lost seed!
Let me first note that, with a small probability, your DATA _NULL_ step will produce an invalid seed: The BEST9. format will use scientific notation (e.g. 4.657E-10) if ranuni(0) happens to be very small. PROC PLAN would then error out with a log message like
ERROR: The value SEED = 4.657E-10 is not an integer.
But apparently this unlikely case did not happen when you created your RandList dataset.
Further, note that leading zeros in macro variable Seedval will be ignored by PROC PLAN. This means that if we recover a seed of, say, 1234, the true "historical" value of Seedval might have been 1234 or 01234 or 001234 or 0001234, but this is unimportant for PROC PLAN. The probability that two significantly different seeds produce the same PROC PLAN output (with 200 observations) is extremely small, if not zero, and the DATA step below can find all of them, if needed (see comment about the STOP statement).
It turned out that the DATA step that I outlined in item 1 of my previous post is so fast that we don't have to limit it to "7 to 9 blocks." Instead, we let it generate treatments as long as they match the treatments in dataset RandList. Thus we can omit items 2 and 3 of the strategy and just write the recovered seed to the dataset or to the log. On my workstation it took only a few seconds to recover a seed; the longest run (about 15 seconds) occurred when the seed was 9999999, i.e., the last seed checked. Nevertheless, the DATA step also allows for partial matches with RandList (just use a smaller value for macro variable nt and comment out the STOP statement immediately following the OUTPUT statement), which may produce more than one "candidate" seed value.
/* Create treatment format */
proc format;
  value trtf
    1, 2 = 'A'
    3, 4 = 'B';
run;

%let c=397204094;          /* "multiplier" found in RANUNI documentation */
%let d=%sysevalf(2**31-1); /* =2147483647 */
%let nt=200;               /* number of treatments to be generated: 50 blocks with 4 treatments each */

/* Find "candidate" seed value(s) for PROC PLAN to reproduce (parts of) dataset RandList */
data candseeds(keep=seed);
  length t $10;

  /* Preparation: compute 10**k * 397204094 modulo 2**31-1, k=1, ..., 9 */
  array n[9] _temporary_;
  n[1]=mod(10*&c, &d);
  do k=2 to 9;
    n[k]=mod(10*n[k-1], &d);
  end;

  /* Prepare assignment of 2nd and 3rd treatment in a block */
  array t2[4,3] _temporary_ (2 3 4  1 3 4  2 1 4  2 3 1);
  array t3[4,4,2] _temporary_ (. . 3 4 2 4 3 2
                               3 4 . . 1 4 3 1
                               2 4 1 4 . . 1 2
                               3 2 3 1 2 1 . .);
  array ppt[&nt] $1 _temporary_; /* for treatments A, A, B, B generated by PROC PLAN */
  array trt[&nt] _temporary_;    /* for treatments 1, 2, 3, 4 generated by the DATA step */

  /* Read the first &nt treatments from RandList into array ppt */
  do p=1 to &nt;
    set RandList(keep=treatment) point=p;
    ppt[p]=char(treatment,5); /* shorten "Arm A" to "A", etc. */
  end;

  /* Run through all possible seeds>0 */
  do seed=1 to 9999999;
    m=seed;
    /* Generate the first &nt treatments like PROC PLAN would do it */
    do i=1 to &nt;
      /* Emulate RANUNI(seed): multiply digit by digit to stay within exact integer range */
      t=put(m,10.);
      s=input(char(t,10), 1.)*&c;
      do k=1 to length(left(t))-1;
        s+input(char(t,10-k), 1.)*n[k];
      end;
      m=mod(s,&d);
      r=m/&d; /* random number between 0 and 1, computed with the RANUNI algorithm */
      /* Assign treatments based on the random number and the previous treatments in the block */
      select(mod(i,4));
        when(1) trt[i]=ceil(4*r);
        when(2) trt[i]=t2[trt[i-1],ceil(3*r)];
        when(3) trt[i]=t3[trt[i-2],trt[i-1],ceil(2*r)];
        otherwise trt[i]=10-trt[i-3]-trt[i-2]-trt[i-1]; /* = the only remaining treatment */
      end;
      if put(trt[i],trtf.) ne ppt[i] then leave; /* move on to next seed if a discrepancy was found */
    end;
    if i>&nt then do; /* i.e., no discrepancies were found */
      output;         /* write "candidate" seed to dataset CANDSEEDS */
      stop;           /* Remove this STOP statement if you expect more than one "candidate" seed, */
    end;              /* e.g., if &nt is less than the number of observations in RandList. */
  end;
  stop; /* necessary because of SET statement with POINT= option */
run;
Please run the code above using your RandList dataset. The expected outcome is a single observation in dataset CANDSEEDS containing the seed value with which PROC PLAN should reproduce dataset RandList.
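For readers who want to follow the logic of the DATA step outside SAS, here is a rough transcription into Python. It is a sketch under stated assumptions, not a verified replacement: the lookup tables are copied from the arrays t2 and t3 above, the function names treatments and candidate_seeds are invented for this illustration, and the code has only been checked for self-consistency on synthetic data, not against real PROC PLAN output.

```python
import math

C = 397204094   # RANUNI multiplier
D = 2**31 - 1   # RANUNI modulus

# Lookup tables copied from the DATA step (treatment codes 1..4).
# T2[first] lists the possible 2nd treatments, T3[(first, second)] the possible 3rd ones.
T2 = {1: (2, 3, 4), 2: (1, 3, 4), 3: (2, 1, 4), 4: (2, 3, 1)}
T3 = {
    (1, 2): (3, 4), (1, 3): (2, 4), (1, 4): (3, 2),
    (2, 1): (3, 4), (2, 3): (1, 4), (2, 4): (3, 1),
    (3, 1): (2, 4), (3, 2): (1, 4), (3, 4): (1, 2),
    (4, 1): (3, 2), (4, 2): (3, 1), (4, 3): (2, 1),
}
LABEL = {1: "A", 2: "A", 3: "B", 4: "B"}  # same mapping as format TRTF


def treatments(seed, n):
    """Yield n treatment labels for a seed, following the DATA step's logic."""
    m = seed
    block = []
    for i in range(1, n + 1):
        m = (C * m) % D          # emulated RANUNI step, exact integer arithmetic
        r = m / D                # uniform number in (0, 1)
        pos = i % 4
        if pos == 1:             # first treatment of a new block
            block = [math.ceil(4 * r)]
        elif pos == 2:
            block.append(T2[block[0]][math.ceil(3 * r) - 1])
        elif pos == 3:
            block.append(T3[(block[0], block[1])][math.ceil(2 * r) - 1])
        else:
            block.append(10 - sum(block))   # the only remaining treatment
        yield LABEL[block[-1]]


def candidate_seeds(observed, max_seed):
    """Return all seeds in 1..max_seed whose label sequence matches `observed`."""
    hits = []
    for seed in range(1, max_seed + 1):
        generated = treatments(seed, len(observed))
        # all() stops at the first mismatch, like the LEAVE statement
        if all(g == o for g, o in zip(generated, observed)):
            hits.append(seed)
    return hits


if __name__ == "__main__":
    # Self-consistency check on synthetic data, not on real PROC PLAN output
    observed = list(treatments(1234, 200))
    print(candidate_seeds(observed, 5000))
```

In this sketch, observed would be the list of 'A'/'B' labels taken from RandList and max_seed would be 9999999. A full search in pure Python is likely to be considerably slower than the DATA step.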
Thanks, @Reeza. 🙂 It was an exciting challenge.
Luckily, the treatment assignments by PROC PLAN based on uniformly distributed random numbers in (0, 1) were pretty much straightforward (see CEIL function calls), except for some unexpected permutations, which I handled with the arrays t2 and t3. Most importantly, the algorithm was the same for all treatment blocks.
Also, without the simplicity of the random number generator implemented in PROC PLAN (same as in the RANUNI function, which had to be emulated in the data step) the task would have been much more difficult.