09-29-2016 02:52 AM
I was going through the SAS Advanced study material and found the following code, which generates a random sample without replacement using the RANUNI function.
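data work.rsubset(drop=obsleft sampsize);
   sampsize=10;                 /* records still needed */
   obsleft=totobs;              /* records not yet considered; totobs comes from NOBS= */
   do while(sampsize>0);
      pickit+1;                 /* advance the pointer; it never revisits a record */
      if ranuni(0)<sampsize/obsleft then do;
         set sasuser.revenue point=pickit nobs=totobs;
         output;
         sampsize=sampsize-1;
      end;
      obsleft=obsleft-1;
   end;
   stop;                        /* needed with POINT=, which never reaches end of file */
run;

proc print data=work.rsubset heading=h label;
   title 'A Random Sample without Replacement';
run;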
If I understand correctly, the program uses the logic below to ensure that the same observation is not entered into the random sample twice.
if ranuni(0)<sampsize/obsleft then do;
Can someone help me understand how the above condition ensures that the same observation is not entered into the random sample again?
How do the fraction sampsize/obsleft and its comparison with ranuni(0) ensure the uniqueness of the observations?
Thanks in advance
P.S. I am using SAS 9.3
09-29-2016 03:49 AM
The ranuni() function returns random values between 0 and 1.
Your code counts the pointer variable pickit up by one on every pass through the loop, and if ranuni() returns a value that falls into the "window" below sampsize/obsleft, the record currently pointed to is picked. At the same time, the remaining sample size is reduced by one.
Once the sample size is 0, no further records will be selected.
The "don't select a record twice" is handled by the fact that the pointer variable pickit is always counted up, so it never contains the same value again.
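To spell out what the sampsize/obsleft comparison contributes: it is not what prevents duplicates (the ever-increasing pickit does that); it makes sure exactly sampsize records get picked and that each record has the same chance. A sketch of the arithmetic, writing N for totobs and n for the initial sampsize (shorthand introduced here, not names from the program) and assuming n is no larger than N:

```latex
% N = totobs, n = initial sampsize (10 in the posted program).
% When pickit = k and m records have already been selected,
% sampsize = n - m and obsleft = N - k + 1, so:
\[
  P(\text{select record } k \mid m \text{ already selected}) = \frac{n - m}{N - k + 1}
\]
% Record 1 is selected with probability n/N. For record 2, condition on record 1:
\[
  P(\text{select record } 2)
  = \frac{n}{N}\cdot\frac{n-1}{N-1} + \left(1-\frac{n}{N}\right)\frac{n}{N-1}
  = \frac{n(n-1) + (N-n)\,n}{N(N-1)}
  = \frac{n}{N}
\]
% If the records still needed equal the records left (n - m = N - k + 1),
% the ratio is 1; ranuni returns values strictly below 1, so the record is always taken.
```

The same cancellation repeats for later records, so every record ends up in the sample with probability n/N, and the loop cannot run past the end of the data set before sampsize reaches 0.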
09-29-2016 10:35 AM
If your purpose is to actually select a sample, not just to do a programming exercise, you might want to investigate PROC SURVEYSELECT.
proc surveyselect data=sasuser.revenue out=work.subset stats sampsize=10; run;
Some advantages are the ability to use strata with different sampling rates, and a choice of methods such as probability proportional to size, sequential sampling, and simple random sampling, with or without replacement.
An added bonus is that the STATS option adds the selection probability and sampling weight to the output data, ready for use in later analysis.
There is a bunch of other stuff as well, such as just flagging the selected records in your output data instead of actually subsetting.
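As a rough sketch of two of the points above, the first step below spells out the method and fixes a seed so the sample can be reproduced, and the second flags the selected records instead of subsetting. The seed value and the output name work.flagged are arbitrary choices made for illustration only.

```sas
/* Simple random sample without replacement, reproducible via SEED= */
proc surveyselect data=sasuser.revenue
                  out=work.subset
                  method=srs      /* simple random sampling, without replacement */
                  sampsize=10
                  seed=12345      /* arbitrary seed, for illustration only */
                  stats;          /* add selection probability and sampling weight */
run;

/* Same design, but OUTALL keeps every input record and adds a 0/1 */
/* variable named Selected instead of writing only the sampled rows */
proc surveyselect data=sasuser.revenue
                  out=work.flagged   /* hypothetical output name */
                  method=srs
                  sampsize=10
                  seed=12345
                  outall;
run;
```

With OUTALL, filtering on Selected=1 afterwards should give the same kind of subset as the first step, while the unselected records stay available for comparison.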
09-29-2016 01:21 PM
This example is basically "Method 3" from the SAS KnowledgeBase article "Simple Random Sample without replacement." The article has extensive comments that explain what is happening at each step in the process.