Deester
Obsidian | Level 7

Hi guys,

I have a data set with over 250,000 observations, each belonging to one of 1000 census block groups. Each individual chose exactly one block group; in other words, everyone faces 1000 choices. I want to randomly generate a sub-choice set of only 20 block groups (instead of 1000) for every observation. The catch is that, for each individual, the block group that person chose must be included in the sub-choice set. I have a "choice" variable ranging from 1 to 1009 that indicates which block group a person chose.

The example data looks like:

ID  year  block_group   choice
1   2011  410050201001  1
2   2014  410050201001  1
3   2005  415050204032  15
4   2012  415050215002  20
5   2009  410510006022  33
...

 

I have a vague idea of how to tackle this problem but am not sure how to implement it. The steps I have in mind are:

1. Create an array of 20 picks, pick1 to pick20.

2. For each individual, set pick1=choice. Then draw a random number between 1 and 1009 using a random number function and compare it to that person's array of picks. If it is not already there, assign it to the next empty pick (pick2, then pick3, and so on). Continue until all 20 picks hold unique numbers.

 

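data test;
set have; /* input data set (name assumed); the DATA step loops over its observations */
array _pick{20} pick1-pick20; /* Create an array of 20 picks for choice subsets */
pick1=choice; /* the chosen block group always takes the first slot */
x=ceil(1009*rand('uniform')); /* random integer in 1..1009; SAS has no randbetween function */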

...

I just have a hard time wrapping my head around this. Any input or suggestions would be much appreciated.

Thank you so much for your time.
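
For readers following along, the two steps outlined above could be sketched as a single DATA step. This is only a minimal sketch under stated assumptions: the data set names `have` and `want`, the helper variables `n` and `x`, and the seed are hypothetical, and `choice` is taken to be an integer from 1 to 1009. Because pick1 is already in the array when each candidate is checked, the chosen block group can never be drawn a second time.

```sas
/* Sketch only: HAVE (input) and WANT (output) are assumed data set names. */
data want;
  set have;
  array _pick{20} pick1-pick20;                /* step 1: 20 picks per person */
  if _n_ = 1 then call streaminit(12345);      /* arbitrary seed, for reproducibility */

  _pick{1} = choice;                           /* step 2: the chosen block group goes first */
  n = 1;                                       /* number of picks filled so far */
  do while (n < 20);
    x = ceil(1009 * rand('uniform'));          /* random integer in 1..1009 */
    if whichn(x, of _pick{*}) = 0 then do;     /* keep x only if it is not already picked */
      n = n + 1;
      _pick{n} = x;
    end;
  end;
  drop n x;
run;
```

With only 20 picks out of 1009 candidates, rejected draws are rare, so the loop should finish quickly even for 250,000 observations.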

 

6 REPLIES
Ksharp
Super User

Try this one. CODE NOT TESTED.

 

proc surveyselect data=have method=srs sampsize=20 out=want;
 cluster choice;
run;
Reeza
Super User

Slight modification to @Ksharp's suggestion: since you absolutely want the choice to be included, use a sampsize of 19 and then append the choice.

Deester
Obsidian | Level 7

Thank you, Reeza. The problem is that if I simply append the chosen block group to the 19 randomly selected block groups, the block group chosen by that individual may already be among those 19. I figured out a way to do this this morning. Thank you for your help!
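
One way to rule out that collision, offered as a sketch and not as the approach the poster ended up using: draw the 19 extra block groups only from the 1008 candidates other than the chosen one. The example below does this with a partial Fisher-Yates shuffle. The data set names `have` and `want`, the helper variables `i`, `j`, `k`, and `tmp`, and the seed are hypothetical, and it assumes `choice` is a non-missing integer from 1 to 1009.

```sas
/* Sketch only: HAVE (input) and WANT (output) are assumed data set names. */
data want;
  set have;
  array _pick{20} pick1-pick20;
  array _pool{1009} _temporary_;               /* candidate ids, rebuilt for each person */
  if _n_ = 1 then call streaminit(2024);       /* arbitrary seed, for reproducibility */

  do i = 1 to 1009;
    _pool{i} = i;
  end;
  /* Park the chosen id in the last slot so draws from slots 1..1008 cannot hit it */
  _pool{choice} = 1009;
  _pool{1009}   = choice;

  _pick{1} = choice;                           /* the chosen block group is always included */
  do k = 1 to 19;
    j = k + floor((1009 - k) * rand('uniform'));  /* random slot in k..1008 */
    tmp = _pool{j};                            /* swap slot j into slot k */
    _pool{j} = _pool{k};
    _pool{k} = tmp;
    _pick{k + 1} = tmp;                        /* slots 1..k now hold distinct ids */
  end;
  drop i j k tmp;
run;
```

Each person ends up with pick1 equal to their own choice plus 19 distinct other ids, with no retry loop. The cost is rebuilding the 1009-element pool for every observation.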

Deester
Obsidian | Level 7

Thanks a lot. proc surveyselect will give me a random sample, but I need to be sure the block group chosen by an individual is included as well.

Ksharp
Super User

I don't understand what you mean. Can you post an example to explain this? Please post data and output.

Deester
Obsidian | Level 7

Each individual has a chosen block group. If I generate a random sample of 20 block groups, I cannot be sure that the block group chosen by each person is among the 20 randomly selected block groups. I have already figured out the code. Thank you anyway!

