Solved: Re: Random character string with no duplicates - Page 2

FreelanceReinh · Posted 08-19-2022 05:27 AM

@mkeintz wrote:
BTW, with the streaminit value I used, there were only 11 instances of duplicates to be skipped.

This is very plausible. I observed between 8 and 23 duplicates in a few trials. According to formulas I've just found in Volume I of Feller (1968), p. 225, the expected value is 15.56 (see code below). So, in this case, the cost of avoiding duplicates lies in maintaining the lookup table rather than in the 0.003% additional samples needed on average.

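%let n=%sysevalf(26**7); /* number of possible strings: 26**7 */
%let r=500000;           /* number of distinct strings wanted */

data _null_;
do k=0 to &r-1;
  s+1/(&n-k);            /* accumulate the sum of 1/(n-k) */
end;
E_exact_=&n*s-&r;        /* expected number of duplicate draws, from the sum */
E_approx=&n*log((&n+0.5)/(&n-&r+0.5))-&r; /* logarithmic approximation */
put (E:)(=best16./);     /* write both values to the log */
run;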
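
For readers who want the reasoning behind the two expressions in the code, here is a sketch of how I read them (my own reconstruction, not a quotation from Feller). It assumes strings are drawn uniformly with replacement from n possibilities until r distinct ones have been collected. Once k distinct strings are in hand, the next new one takes a geometric number of draws with mean n/(n-k), so:

```latex
\begin{align*}
E[\text{total draws}] &= \sum_{k=0}^{r-1} \frac{n}{n-k}, \\
E[\text{duplicates}]  &= n \sum_{k=0}^{r-1} \frac{1}{n-k} - r
   \;\approx\; n \ln\frac{n + \tfrac12}{n - r + \tfrac12} - r
   \;\approx\; \frac{r^2}{2n} \quad (r \ll n).
\end{align*}
```

With n = 26^7 = 8031810176 and r = 500000, the last expression gives r^2/(2n) ≈ 15.56, which agrees with the value quoted above.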

Ksharp · Posted 08-19-2022 08:43 AM

/* Try the UUIDGEN() function.
   If you only want letters,
   you could get rid of the digits. */
data a;
do i=1 to 50000;
want=uuidgen(123);
output;
end;
run;

Rick_SAS · Posted 08-19-2022 09:48 AM

The OP has selected an answer, but I am still curious about the reason for this question. @KatLinden Why do you want the character strings to be random? How will these strings be used?

Rick_SAS · Posted 09-19-2022 09:54 AM

The discussion on this thread inspired me to think about this problem and write up a solution. My approach: Use base 26 to convert a set of unique integers into a set of unique strings. If you expect to assign IDs to N subjects, you can use strings that have k characters, where N < 26^k.

How to generate the ID values (strings) from integers: "Base 26: A mapping from integers to strings"
How to randomly select N unique IDs: "Generate random ID values for subjects in SAS"

The primary advantage of this technique over some of the other proposals is that it ensures uniqueness of the ID values. You don't have to check whether a random string has already been assigned.
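
To make the base-26 idea concrete, here is a minimal sketch of the integer-to-string mapping. It is not the code from the linked articles; the data set name ids, the variable names, the fixed length of 7 characters, and the choice of 'A' for digit 0 are all assumptions made for illustration. Distinct integers in the range 0 to 26**7 - 1 give distinct strings, so no duplicate check is needed.

```sas
/* Sketch only: map nonnegative integers to 7-letter strings in base 26. */
/* Assumptions: 'A' represents digit 0, and every n is below 26**7.      */
data ids;
   length id $7;
   retain letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
   do n = 0, 1, 25, 26, 27;      /* a few illustrative integers */
      m = n;
      /* fill the string from the last character to the first */
      do j = 7 to 1 by -1;
         substr(id, j, 1) = substr(letters, mod(m, 26) + 1, 1);
         m = floor(m / 26);      /* move on to the next base-26 digit */
      end;
      output;
   end;
   keep n id;
run;

proc print data=ids noobs;
run;
```

With this convention, 0 maps to AAAAAAA, 25 to AAAAAAZ, and 26 to AAAAABA. To get random IDs, you would feed the loop a random sample of distinct integers instead of the fixed list used here.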
