05-31-2012 07:25 PM
I need to generate a random sample from a given population, and I am using proc surveyselect. The code that generates the sample runs at exactly the same time each week. If I do not specify a seed option, the computer clock is used, but since the timing is so close from run to run, the results have far too much overlap week to week. Each observation has several numeric variables, and I was thinking of using one of those as a "seed", but I'm not sure if this is the best way.
Would anyone have a bit of expert advice on a good way to move forward?
Thanks very much.
05-31-2012 07:43 PM
Could you make a random number in a datastep, store it as a macro variable, then use that as the seed?
Here's a good read on the random number generator:
Are you using method = srs? Another option would be to try the other sampling methods...
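A rough sketch of the data step idea above, in case it helps. The macro variable name sample_seed is made up for illustration, and the dataset and variable names are simply the ones used elsewhere in this thread. Note that call streaminit(0) still takes its starting point from the system clock, so this is one possible way to pick a seed and log it, not a guaranteed fix for the overlap.

```sas
/* Draw one random integer and store it in a macro variable.      */
/* sample_seed is a hypothetical name chosen for this example.    */
data _null_;
   call streaminit(0);                          /* 0 = initialize from the system clock */
   seed = ceil(rand('uniform') * 2147483646);   /* positive integer below 2**31 - 1     */
   call symputx('sample_seed', seed);           /* strips blanks, stores as text        */
run;

/* Write the seed to the log so a given week's draw can be reproduced later. */
%put NOTE: seed used for this run = &sample_seed;

proc surveyselect data = incoming out = outgoing
      method = srs
      rate = .00471
      seed = &sample_seed;
   id var1 var2 var3;
run;
```

Keeping the seed in the log means any weekly sample can be regenerated by passing the same value to seed= again.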
06-01-2012 05:54 PM
The total population is just over 14,000 and I am using surveyselect to sample 0.471% of it (rate = .00471).
The same observations seem to come up pretty frequently, almost as if surveyselect assigns a random number and then begins selection at the same place it did on the previous run.
Should I be sorting the data prior to selection?
/* Simple random sample at a fixed rate; no seed= is given, so the clock is used */
proc surveyselect data = incoming out = outgoing
      method = srs
      rate = .00471;
   id var1 var2 var3;
run;
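One way to put a number on "come up pretty frequently" would be to count how many observations two weekly samples share. This is only a sketch: outgoing_prev is a hypothetical saved copy of the previous week's output, and it assumes var1, var2 and var3 together identify an observation uniquely, which may not hold for this data.

```sas
/* Count observations that appear in both this week's and last week's sample. */
/* outgoing_prev is a hypothetical dataset holding the previous week's draw.  */
proc sql;
   select count(*) as n_overlap
   from outgoing as a
        inner join outgoing_prev as b
        on  a.var1 = b.var1
        and a.var2 = b.var2
        and a.var3 = b.var3;
quit;
```

For scale, a rate of .00471 on roughly 14,000 records is about 66 observations per sample, so two independent simple random samples would be expected to share well under one observation on average (about 66 * 66 / 14,000).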