Hi All,
I was going through the SAS Advanced study material and found the following code, which generates a random sample without replacement using the RANUNI function.
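data work.rsubset(drop=obsleft sampsize);
   sampsize=10;                 /* records still to be selected */
   obsleft=totobs;              /* records not yet considered; totobs is set by NOBS= at compile time */
   do while(sampsize>0);
      pickit+1;                 /* move on to the next observation number */
      if ranuni(0)<sampsize/obsleft then do;
         set sasuser.revenue point=pickit nobs=totobs;
         output;
         sampsize=sampsize-1;
      end;
      obsleft=obsleft-1;
   end;
   stop;                        /* needed with POINT= so the DATA step ends */
run;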
proc print data=work.rsubset heading=h label;
title 'A Random Sample without Replacement';
run;
If I understand correctly, the program relies on the condition below to make sure the same observation is not added to the random sample twice.
if ranuni(0)<sampsize/obsleft then do;
Can someone help me understand how this condition ensures that the same observation is not selected again?
How does comparing ranuni(0) with the fraction sampsize/obsleft guarantee that each observation is unique?
Thanks in advance 🙂
P.S. I am using SAS 9.3
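The ranuni() function returns a uniform random value between 0 and 1.
Your code counts the pointer pickit up by 1 on every pass through the loop. If ranuni() returns a value below sampsize/obsleft, the record currently pointed to is picked and sampsize is reduced by 1. In other words, each record is picked with probability (records still needed)/(records still left), and obsleft is reduced by 1 on every pass whether or not the record was picked.
Once sampsize reaches 0, the loop ends and no further records are selected.
So the condition itself does not prevent duplicates; it only controls how likely each record is to be picked. The "don't select a record twice" part is handled by the fact that the pointer variable pickit is always counted up, so it never contains the same value again.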
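To watch the condition at work, here is a small trace, offered as a sketch rather than anything from the study material. It reads no data set at all: the population size of 10, the sample size of 3, the seed 12345 and the names work.trace, p, u and picked are all assumptions made up for the illustration. It writes one row per pass through the loop, so you can see the probability change and pickit only ever increase.

```sas
/* Illustrative trace only: totobs, sampsize, the seed and all names here are assumed */
data work.trace;
   totobs=10;                      /* assumed population size */
   sampsize=3;                     /* assumed sample size */
   obsleft=totobs;
   do while(sampsize>0);
      pickit+1;                    /* position under consideration, always increasing */
      p=sampsize/obsleft;          /* chance of picking this position */
      u=ranuni(12345);             /* positive seed so the stream can be repeated */
      picked=(u<p);                /* 1 if this position is selected, else 0 */
      output;                      /* record the state before the counters change */
      if picked then sampsize=sampsize-1;
      obsleft=obsleft-1;
   end;
run;

proc print data=work.trace noobs;
   var pickit obsleft sampsize p u picked;
run;
```

Two things should be visible in the printout. Each pickit value appears on exactly one row, so a position can be picked at most once. And if the records still needed ever equal the records still left, p becomes 1 and every remaining position is picked, which is why the loop always ends with exactly the requested sample size.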
If your purpose is to actually select a sample, not just a programming exercise, you might want to look into PROC SURVEYSELECT.
For example:
proc surveyselect data=sasuser.revenue out=work.subset stats
sampsize=10;
run;
Some advantages are the ability to use strata with different sampling rates, and a choice of methods such as probability proportional to size, sequential, and simple random sampling, with or without replacement.
An added bonus is that the STATS option adds the selection probability and sampling weight to the output data.
There are a number of other features as well, such as flagging the selected records in the output data instead of actually subsetting.
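As a rough sketch of the flagging approach, the step below uses the OUTALL option, which keeps every input record and adds a Selected variable. The output data set name and the seed value are assumptions chosen for the example; METHOD=SRS is spelled out only to make the without-replacement choice explicit.

```sas
/* Sketch only: the output data set name and the seed value are assumptions */
proc surveyselect data=sasuser.revenue
                  out=work.revenue_flagged   /* hypothetical output name */
                  method=srs                 /* simple random sampling, without replacement */
                  sampsize=10
                  seed=12345                 /* arbitrary seed so the draw can be repeated */
                  outall;                    /* keep all records; Selected=1 marks the sample */
run;

/* Quick check of how many records were flagged */
proc freq data=work.revenue_flagged;
   tables Selected;
run;
```

With a fixed seed the same records should be flagged on every run, which is handy when a sample has to be reproduced later.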
This example is basically "Method 3" from the SAS KnowledgeBase article "Simple Random Sample without replacement." The article has extensive comments that explain what is happening at each step in the process.