Random Sampling Without Replacement Until Exhausted Then Repeat

N/A
Posts: 0

I'm currently trying to make a random stratified sampling schedule and cannot figure it out. Here is a sample of the data:
Obs Date Month S_Regions

1 Sat, Jan 8, 2011 1 3
2 Sat, Jan 29, 2011 1 7
3 Tue, Jan 18, 2011 1 3
4 Thu, Jan 6, 2011 1 3
5 Thu, Jan 20, 2011 1 8
6 Sat, Jan 1, 2011 1 2
7 Tue, Jan 25, 2011 1 3
8 Thu, Jan 13, 2011 1 5
9 Sat, Jan 22, 2011 1 1
10 Tue, Jan 4, 2011 1 6
11 Tue, Jan 11, 2011 1 6
12 Thu, Jan 27, 2011 1 4
13 Sat, Jan 15, 2011 1 1
14 Tue, Feb 22, 2011 2 6

Obs is the observation number, Date is the planned sampling date (randomly ordered within each month, with the months in order), Month is the planned sampling month, and S_Regions is a random number from 1 to 6, currently chosen with no constraints. What I need to do is pick a random number from 1 to 6 without replacement until all of the numbers have been assigned to a date, then repeat that until every date in the month has an S_Regions value. Then the process should start over with the next month. I don't have any sample code yet because I'm still trying to figure out how to start. If you have any ideas, please let me know. Thanks in advance.
Occasional Contributor
Posts: 17

Re: Random Sampling Without Replacement Until Exhausted Then Repeat

Posted in reply to deleted_user
As far as I know, generating unique random numbers is difficult in any programming language.

An alternative in SAS is to generate a random number for each observation with a function such as rannor or ranuni, then sort by that variable. The data is then in random order, and you can pick the first 6 or last 6 observations in each month (in your case). This is one of the best approaches I use.
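
A minimal sketch of the sort-by-random-key idea described above. The data set name Sample_Dates, the variable names SortKey and Pick, the seed, and the sample size of 6 are assumptions made only for illustration. Note that this draws days at random within each month; on its own it does not assign regions in blocks until they are exhausted.

```sas
/* Hypothetical input: every Tuesday, Thursday, and Saturday of 2011 */
data Sample_Dates;
  format Date date9.;
  call streaminit(12345);            /* hypothetical seed, for reproducibility */
  do Date = mdy(1,1,2011) to mdy(12,31,2011);
    if weekday(Date) in (3,5,7) then do;
      Month   = month(Date);
      SortKey = rand('uniform');     /* random key used only for ordering */
      output;
    end;
  end;
run;

/* Random order within each month */
proc sort data=Sample_Dates;
  by Month SortKey;
run;

/* Keep the first 6 observations of each month */
data First6;
  set Sample_Dates;
  by Month;
  if first.Month then Pick = 0;
  Pick + 1;
  if Pick <= 6;
run;
```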

Please let me know if you find any other alternative solution.

S
N/A
Posts: 0

Re: Random Sampling Without Replacement Until Exhausted Then Repeat

Thanks for taking the time to look at my question. I'm still having quite a bit of trouble wrapping my head around this. I figured out how to generate random numbers in the range I need:

S_Regions = int(&S_Regions*ranuni(0)) + 1;

In this case it just picks a number from 1 to 8 (S_Regions is a global macro variable chosen by the user). I've already figured out how to order all the dates I need to sample by random numbers, but I can't just use that order. I need to assign a number (1 through 8 in this case) to each day without replacement until all the numbers have been used, then repeat until all the days in a month are exhausted. If I just assign the numbers 1 through 8 in sequence after the random order is chosen, the lower numbers will be sampled more often throughout the year. I already had SAS make me a table with all the days I need to sample. The days aren't what needs to be random; they're just every Tuesday, Thursday, and Saturday of the year (holidays will be dealt with on an individual basis). I just can't figure out the code to do what I need. Here is an example in pseudo-code of what I need:

By Month
  Pick 1st n days of that month
    Randomly assign 1 to n without replacement to each day
  Pick next n days of that month
    Randomly assign 1 to n without replacement to each day
  ...repeat until all days that month are exhausted then repeat with next month
Do until all sample days have been assigned a number

To make this clearer: the numbers actually represent sample regions of the coast. So what this is actually doing is randomly assigning a sample region to each sample day without replacement until the regions are exhausted.
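
The pseudo-code above could be sketched in a single DATA step roughly as follows. This is only one possible approach, not a method proposed in the thread: it uses CALL RANPERM to shuffle the numbers 1 to n for each block, and it walks the days of each month in calendar order rather than in a random order (each block still receives an independent random permutation). The data set name Sample_Schedule, the variable names, and the seed are hypothetical.

```sas
%let year      = 2011;
%let S_Regions = 8;       /* number of sample regions */
%let Seed      = 12345;   /* hypothetical seed, for reproducibility only */

data Sample_Schedule(keep=Date Month S_Region);
  format Date date9.;
  array r[&S_Regions] r1-r&S_Regions;   /* holds the current shuffled block */
  seed      = &Seed;                    /* CALL RANPERM updates this variable */
  pos       = 0;                        /* position within the current block */
  PrevMonth = .;
  do Date = mdy(1,1,&year) to mdy(12,31,&year);
    if weekday(Date) in (3,5,7) then do;   /* Tuesday, Thursday, Saturday */
      Month = month(Date);
      /* Reshuffle at the start of a month or when the block is used up */
      if Month ne PrevMonth or pos >= &S_Regions then do;
        do i = 1 to &S_Regions;
          r[i] = i;
        end;
        call ranperm(seed, of r1-r&S_Regions);
        pos = 0;
      end;
      pos       = pos + 1;
      S_Region  = r[pos];
      PrevMonth = Month;
      output;
    end;
  end;
run;
```

Because the block is reshuffled whenever the month changes, a month's last block may be incomplete, which matches the "start over with the next month" requirement.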
Respected Advisor
Posts: 4,173

Re: Random Sampling Without Replacement Until Exhausted Then Repeat

Posted in reply to deleted_user
Is this what you're after?

%let StartDay=01Jan2010;
%let EndDay =31Dec2010;

%let SamplingRegions=8;
%let Seed=1;


data SamplingDays(keep=SamplingDay);
  format SamplingDay Date9.;
  do SamplingDay = "&StartDay"d to "&EndDay"d;
    if weekday(SamplingDay) in (3,5,7) then
      do;
        n+1;
        output;
      end;
  end;
  call symput('RepeatRows',cats(ceil(n/&SamplingRegions)+1));
run;


proc plan seed=&Seed;
  factors RepeatRow=&RepeatRows SamplingRegion=&SamplingRegions / noprint;
  output out=SamplingRegions(keep=SamplingRegion);
run;

data SamplingDaysAndRegions;
  set SamplingDays;
  set SamplingRegions;
run;

proc print data=SamplingDaysAndRegions noobs;
run;
N/A
Posts: 0

Re: Random Sampling Without Replacement Until Exhausted Then Repeat

Brilliant! I had never heard of Proc Plan until now. That seems to do the trick. I changed the code to work by month because I need the samples from the proc plan to start over each month. Any idea how to do an entire year in one run like this? If not, no worries; this schedule will be given out in 2-month increments anyway, so it won't take much time to change the code and run it twice.

/*Define the number of regions in each coastal division*/
%let N_Regions = 6;
%let M_Regions = 8;
%let S_Regions = 8;

/*Choose the year and month you wish to sample*/
%let year = 2011;

%let monthSamp = 1;

/*Create new table of all possible dates in the year to be sampled.*/
/*Pick all Tuesdays, Thursdays, and Saturdays of the user chosen year*/
/*Get random number for each date, then sort by that number for each month to determine numbering before region assignments*/
data Sample_Dates (drop=n);
  Do Date=(MDY(1,1,&year)) to (MDY(12,31,&year));
    if weekday(Date) in (3,5,7) then
      if Month(Date) = &MonthSamp then
        do;
          n+1;
          RandomNum = rand('UNIFORM');
          Month = Month(Date);
          output;
        end;
  end;
run;

proc sort data = Sample_Dates;
  By RandomNum;
run;

/*Choose regions to sample each day randomly until exhausted and repeat 5 times (should cover all days in a month)*/
/*Then add randomly sampled regions to the date set*/
/*Repeat for each coastal divide*/

/*North Division*/
proc plan;
  factors NRepeatRow=5 ordered
          NRegion=&N_Regions random;
  output out=NSample (keep=NRegion);
run;

data Sample_Dates;
  set Sample_Dates;
  set NSample;
run;


Thanks for all your help!!!
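
One way the PROC PLAN approach might be extended to cover an entire year in one run is to add the month as an outer ordered factor and then match the plan to the dates by month and by position within the month. This is a sketch under assumptions, not tested against the poster's data: the names MaxBlocks, Seq, Block, NPlan, and Schedule and the seed are hypothetical, and the days are taken in calendar order within each month rather than in random order.

```sas
%let year      = 2011;
%let N_Regions = 6;
%let MaxBlocks = 5;   /* hypothetical: enough blocks to cover any month */

/* All Tuesdays, Thursdays, and Saturdays of the year, numbered within month */
data Sample_Dates;
  format Date date9.;
  Month = .;
  do Date = mdy(1,1,&year) to mdy(12,31,&year);
    if weekday(Date) in (3,5,7) then do;
      if month(Date) ne Month then Seq = 0;   /* restart numbering each month */
      Month = month(Date);
      Seq   = Seq + 1;
      output;
    end;
  end;
run;

/* For every month, MaxBlocks independent random orderings of the regions */
proc plan seed=12345;
  factors Month=12 ordered
          Block=&MaxBlocks ordered
          NRegion=&N_Regions random / noprint;
  output out=NPlan;
run;

/* Number the plan rows within each month so they line up with the dates */
data NPlan(keep=Month Seq NRegion);
  set NPlan;
  by Month;
  if first.Month then Seq = 0;
  Seq + 1;
run;

/* Attach a region to each date; unused plan rows are dropped */
data Schedule;
  merge Sample_Dates(in=inDates) NPlan;
  by Month Seq;
  if inDates;
run;
```

The same pattern could be repeated with M_Regions and S_Regions for the other coastal divisions.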
SAS Super FREQ
Posts: 8,864

Re: Random Sampling Without Replacement Until Exhausted Then Repeat

Posted in reply to deleted_user
Hi:
If you go to support.sas.com, and enter this search string in the upper right-hand search box (the entry box near the orange SEARCH button)
random sampling without replacement

...most of the hits that come up are Tech Support notes that have examples of doing random sampling. Some of the notes include (but are not limited to) these:
Sample 24763: Stratified random sample without replacement, unequal allocation
Sample 24722: Simple random sample without replacement
Sample 24760: Stratified random sample without replacement, equal allocation
and
Usage Note 22952: How can I take a random sample with probability proportional to size? (which points you to PROC SURVEYSELECT)

cynthia