I have two constraints for choosing a random sample without replacement. I want to do a survey of 100 people. My two constraints are:
Thanks in advance.
I have 50 states in my sample frame. Also I have ANOTHER constraint that needs to be included. So just to be clear, let me repeat my problem once again (with the additional constraint).
1. My sample frame has 50 states and zip codes that end with 0 through 9. Firstly, I want to randomly sample 100 observations (10 subgroups of 10 observations each) based on their zip code. (First subgroup of 10 observations has zip code ending with 0, second subgroup has zip code ending with 1, etc.)
2. The second constraint (including the new one) is that I want at most 5 observations from state1, at most 3 observations from state2, and fewer than 2 observations from each of the other states. This is the stage where I am having a problem.
Thanks again for your help.
Add a state variable to your resulting sample (with the ZIPSTATE, ZIPNAME or ZIPNAMEL function) unless the state is already there, then run PROC FREQ (or your favorite summarization procedure) on it. Check the counts; if they are not as desired, resample.
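A minimal sketch of that check, assuming the selected units are in a data set named sample and that the full ZIP code is stored in a variable called zipcode (both names are assumptions for illustration, not something stated above):

```sas
/* Assumption: SAMPLE holds the selected units and ZIPCODE holds the full ZIP code. */
data sample_st;
   set sample;
   state = zipstate(zipcode);   /* two-letter state postal code */
run;

/* Count the selected units per state and compare with the limits. */
proc freq data=sample_st;
   tables state / nocum nopercent;
run;
```

If any count exceeds its limit, draw the sample again and repeat the check.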
I have done something similar because of costs associated with a study, BUT I have a sneaking feeling in the back of my mind that frequent resampling may not quite give the correct sample weights. A moderate amount of code could turn this into a macro loop.
Or, if your source data is large enough, request a number of replicates (REPS= option), examine each replicate for fit within your constraints, and select appropriate replicates.
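The replicate approach might look roughly like the sketch below. It assumes a data set named frame, sorted by zip, with variables zip (the final digit) and state. The replicate count, the seed and the constraint itself (no state with more than 2 units in any zip stratum) are placeholders, so substitute your own limits.

```sas
/* Draw 200 independent replicates, each with 10 units per zip stratum. */
/* Assumption: FRAME is sorted by ZIP and has variables ZIP and STATE.  */
proc surveyselect data=frame out=reps method=srs
                  sampsize=10 reps=200 seed=12345;
   strata zip;
run;

proc sql;
   /* Selected units per replicate, zip digit and state. */
   create table counts as
   select Replicate, zip, state, count(*) as n
   from reps
   group by Replicate, zip, state;

   /* Placeholder constraint: no state has more than 2 units */
   /* in any zip stratum of the replicate.                   */
   create table ok_reps as
   select Replicate
   from counts
   group by Replicate
   having max(n) <= 2;

   /* Keep the first replicate that satisfies the constraint. */
   create table sample as
   select r.*
   from reps as r
   where r.Replicate = (select min(Replicate) from ok_reps);
quit;
```

If ok_reps comes back empty, sample will be empty too; in that case raise REPS= or relax the constraint. Picking only the replicates that pass is still a form of rejection, so the caution about sample weights applies here as well.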
Following @ballardw's suggestion, here is an example. The goal here is to get no state with 3 or more selected units and at most two states with 2 selected units for any given zip code termination digit.
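/* Simulated frame: 50 states, 1000 units each, random final zip digit 0-9 */
data frame;
   call streamInit(17646);
   do state = 1 to 50;
      do id = 1 to 1000;
         zip = int(10*rand("UNIFORM"));
         output;
      end;
   end;
run;

/* STRATA processing requires the frame to be sorted by the strata variable */
proc sort data=frame; by zip state; run;

%macro mySurvey;
   /* Resample until no zip stratum has a state with 3 or more units */
   /* and no zip stratum has more than two states with exactly 2 units */
   %do %until(&n3=0 AND &n2<=2);
      proc surveyselect data=frame out=sample sampsize=10;
         strata zip;
      run;
      proc sql;
         /* n3s: states with 3+ units; n2s: states with exactly 2 units, per zip */
         select max(n3s), max(n2s) into :n3, :n2
         from (
            select sum(n >= 3) as n3s, sum(n >= 2) - sum(n >= 3) as n2s
            from (
               select zip, count(*) as n from sample group by zip, state)
            group by zip);
      quit;
   %end;
%mend mySurvey;
%mySurvey;

/* Final check: number of selected units per zip digit and state */
proc sql;
   select zip, state, count(*) as n from sample group by zip, state;
quit;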