Create random choice subsets for each obs

Occasional Contributor
Posts: 9

Hi guys,

I have a data set with over 250,000 observations (individuals) spread across 1000 census block groups. Each individual chose exactly one block group; in other words, everyone faces 1000 choices. I want to randomly generate a sub-choice-set of only 20 block groups (instead of 1000) for every observation. The catch is that, for each individual, the block group that person actually chose must be included in the sub-choice-set. I have a "choice" variable ranging from 1 to 1009 that indicates which block group a person chose.

The example data looks like:

ID  year  block_group   choice
1   2011  410050201001  1
2   2014  410050201001  1
3   2005  415050204032  15
4   2012  415050215002  20
5   2009  410510006022  33
...

 

I have a vague idea of how to tackle this problem but am not sure how to implement it. The steps I have in mind are:

1. Create an array of 20 picks, pick1 to pick20.

2. For each individual, set pick1=choice. Then draw a random number between 1 and 1009 using a random number function and compare it to the array of picks for that person. If the number has not already been picked, assign it to pick2. Continue until all 20 picks hold unique numbers.

 

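data test;
  set have; /* assumption: "have" stands in for the input data set */
  array _pick{20} pick1-pick20; /* array of 20 picks for the choice subset */
  /* no explicit loop over observations is needed: the DATA step reads one row per iteration */
  pick1 = bgid;
  x = ceil(1009 * rand('uniform')); /* random integer in 1..1009; RANDBETWEEN is not a SAS function */

...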

I'm just having a hard time wrapping my head around this. Any input or suggestions would be much appreciated.

Thank you so much for your time.
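
The two steps described above can be sketched as a single DATA step. This is a minimal, untested sketch under a few assumptions: the input data set is called have (a hypothetical name), choice is numeric and lies in 1..1009, and the seed 12345 is arbitrary. It draws candidates one at a time and rejects any value that is already among the picks.

```sas
/* Sketch only: HAVE is a hypothetical input data set with one row per person
   and a numeric CHOICE variable in 1..1009. */
data want;
  set have;
  array pick{20} pick1-pick20;

  /* seed the random number stream once (arbitrary seed, for reproducibility) */
  if _n_ = 1 then call streaminit(12345);

  pick{1} = choice;   /* the chosen block group is always included */
  n = 1;              /* number of picks filled so far for this person */

  do while (n < 20);
    x = ceil(1009 * rand('uniform'));   /* random integer in 1..1009 */
    /* WHICHN returns 0 when x is not yet among the picks */
    if whichn(x, of pick{*}) = 0 then do;
      n = n + 1;
      pick{n} = x;
    end;
  end;

  drop n x;
run;
```

Because only 20 of 1009 values are needed, a rejected draw is uncommon, so the loop usually finishes after very few extra draws per person.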

 

Super User
Posts: 9,681

Re: Create random choice subsets for each obs

Try this one. CODE NOT TESTED.

 

proc surveyselect data=have method=srs sampsize=20 out=want;
 cluster choice;
run;

Super User
Posts: 17,832

Re: Create random choice subsets for each obs

A slight modification to @Ksharp's suggestion: since you absolutely want the choice to be included, use a sampsize of 19 and then append the choice.

Occasional Contributor
Posts: 9

Re: Create random choice subsets for each obs

Thank you Reeza. The problem is that if I simply append the chosen block group to the 19 randomly selected block groups, there is a possibility that the block group chosen by that individual is already among those 19. I figured out a way to do this this morning. Thank you for your help!
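
One way to rule out the duplicate described here is to take the chosen block group out of the pool before drawing the other 19. The following is an untested sketch, not necessarily the approach the poster ended up with. It assumes a hypothetical input data set have with a numeric choice in 1..1009 and an arbitrary seed, and it uses a partial Fisher-Yates shuffle so that no draw ever has to be rejected.

```sas
/* Sketch only: HAVE and the seed are assumptions, not taken from the thread. */
data want;
  set have;
  array pool{1009} _temporary_;   /* candidate block group indices */
  array pick{20} pick1-pick20;

  if _n_ = 1 then call streaminit(12345);

  /* reset the pool to 1..1009 for this person */
  do i = 1 to 1009;
    pool{i} = i;
  end;

  /* swap the chosen block group into position 1 so it is always kept */
  pool{choice} = 1;
  pool{1} = choice;
  pick{1} = choice;

  /* partial Fisher-Yates shuffle: fill positions 2..20 from the other 1008 values */
  do i = 2 to 20;
    j = i + floor((1009 - i + 1) * rand('uniform'));   /* random index in i..1009 */
    tmp = pool{j};
    pool{j} = pool{i};
    pool{i} = tmp;
    pick{i} = pool{i};
  end;

  drop i j tmp;
run;
```

Each person gets the chosen block group in pick1 plus 19 distinct others, at the cost of resetting a 1009-element array for every row.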

Occasional Contributor
Posts: 9

Re: Create random choice subsets for each obs

Thanks a lot. proc surveyselect will give me a random sample, but I need to be sure that the block group chosen by each individual is included as well.

Super User
Posts: 9,681

Re: Create random choice subsets for each obs

I don't understand what you mean. Can you post an example to explain this? Post data and output.

Occasional Contributor
Posts: 9

Re: Create random choice subsets for each obs

Each individual has a chosen block group. By generating a random sample including 20 block groups, I cannot be sure that the block group that was chosen by each person is within the 20 randomly selected block groups. I have already figured out the code. Thank you anyway!
