11-08-2016 04:37 PM
I've cruised this site for multiple methods of random number generation and assignment but did not find a good fit (as far as I know).
My use case is to assign a random number to each individual in a population. I don't care about uniform distribution of the random values.
Is there a solid way to ensure no duplication of random numbers generated without having to go through do-loop iterations?
I realize there is a much higher chance of random number duplication than might otherwise be expected, but I suspect there must be a quick, efficient best practice out there for this.
Any help would be appreciated.
11-08-2016 04:50 PM
It may be appropriate to discuss how you will use that random number. If the purpose is to select a random sample then the procedure Surveyselect may be a better idea.
The likelihood of duplication arises with larger numbers of records. So how big is your data set? Also sometimes "duplication" is the result of a default display format rounding values to a number of digits.
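Both points above can be sketched in a few lines. This is only an illustration under assumptions: the data set names have and sample, the sample size of 100, and the seed are hypothetical, and the second step assumes a data set almost_there that already holds a random_order variable.

```sas
/* Simple random sample without replacement.                 */
/* have, sample, n=100 and the seed are placeholder choices. */
proc surveyselect data=have out=sample method=srs n=100 seed=12345;
run;

/* Print the random values with more digits than the default */
/* display format shows, so rounding does not make distinct  */
/* values look identical.                                    */
proc print data=almost_there(obs=10);
   format random_order 18.16;
run;
```

If the values still match at full display precision, they are more likely to be true duplicates rather than a formatting effect.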
11-08-2016 04:54 PM
This may not fit your idea of "quick" or "efficient" but it is tried and true and relatively easy to understand. Old style coding:
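data almost_there; set have; random_order = ranuni(12345); run;   /* have = your input data set (placeholder name) */
proc sort data=almost_there; by random_order; run;
data want; set almost_there; random_number = _n_; run;   /* want = numbered output (placeholder name) */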
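Assign a random fraction to each observation, sort by that fraction, then just number the observations in order. Because the final numbers are simply 1 through N, no two observations can end up with the same one.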
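If you want to confirm that the result has no duplicates, a quick check along these lines may help. It is a minimal sketch that assumes the numbered output is stored in a data set called want (a hypothetical name) with the variable random_number.

```sas
/* want is a placeholder name for the final, numbered data set. */
/* If every random_number is unique, the two counts are equal.  */
proc sql;
   select count(*)                      as n_obs,
          count(distinct random_number) as n_distinct
   from want;
quit;
```

Ties in random_order, if any occur, only affect which of the tied observations is numbered first. They do not create duplicate values of random_number.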