dewittme
Fluorite | Level 6

Hi

 

I am trying to create a sampling frame. The problem is unusual in that I want to divide a population into two samples that are stratified by several variables (demographics, gender, etc.). I don't just want one sample of the population, however; I want to divide all of the data into two samples, A and B.

 

I have run a proc freq to get the marginal probabilities and have joined them to my original data.

 

Is there a way to tell PROC SURVEYSELECT that I want a sample containing 50% of my data? The sample size arguments specify the number of samples per stratum, which is not exactly what I want.

 

Any help would be appreciated.

 

1 ACCEPTED SOLUTION

Accepted Solutions
dewittme
Fluorite | Level 6

Ended up using the SRS method and just verifying the marginal probabilities afterwards. This worked fine.

 

 

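/* Draw a simple random sample (no stratification) of 2,648 rows */
PROC SURVEYSELECT data = mydata
    method = srs      /* simple random sampling without replacement */
    n = 2648          /* number of rows to select */
    seed = 1234       /* fixed seed so the draw is reproducible */
    out = srs_method  /* output data set holding the selected rows */
    ;
RUN;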
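
A minimal sketch of how the split into A and B and the follow-up check might look. It is not the code that was actually run: it assumes the OUTALL option (so the output keeps every row with a 0/1 Selected flag) and SAMPRATE=0.5 in place of a hard-coded n. The variable names gender and age_group and the data set names srs_split, sample_a, and sample_b are hypothetical.

```sas
/* Sketch only: gender and age_group are hypothetical variable names */
proc surveyselect data=mydata
    method=srs
    samprate=0.5      /* select half of the rows */
    seed=1234
    out=srs_split
    outall;           /* keep all rows; Selected = 1 (sampled) or 0 */
run;

/* Split into the two samples based on the Selected flag */
data sample_a sample_b;
    set srs_split;
    if Selected = 1 then output sample_a;
    else output sample_b;
run;

/* Compare the marginal distributions of the two halves:
   column percentages show each variable's distribution within
   Selected = 0 and Selected = 1 */
proc freq data=srs_split;
    tables (gender age_group) * Selected / nopercent norow chisq;
run;
```

If the column percentages for the two halves are close and the chi-square tests show no association with Selected, the simple random split has probably balanced those variables well enough.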

ballardw
Super User

You might provide a small example of your data and what you would expect a possible result to look like after your process.

You may only need the Groups= option.

Both the SAMPSIZE and SAMPRATE options allow use of a data set to control the numbers/rates per strata combination. So if you set that up correctly using your proc freq information that might be what you are looking for. This would be a different data set than your set to sample from and must have a specific structure. So read the documentation.

Whenever you specify one or more strata, it is your responsibility to choose selection rates or sizes that add up to the "total" that you want.
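
A minimal sketch of both suggestions, under stated assumptions: gender and age_group are hypothetical strata variables, the data set names are made up, and the GROUPS= option may not be available in older SAS/STAT releases, so check the documentation for your version.

```sas
/* STRATA requires the input to be sorted by the strata variables */
proc sort data=mydata out=mydata_sorted;
    by gender age_group;
run;

/* Option 1: GROUPS=2 randomly assigns every row to one of two
   groups, independently within each stratum */
proc surveyselect data=mydata_sorted
    groups=2
    seed=1234
    out=split_groups;   /* output should carry a GroupID variable */
    strata gender age_group;
run;

/* Option 2: select 50% within each stratum and keep all rows,
   flagged by Selected = 1 (sampled) or 0 (not sampled) */
proc surveyselect data=mydata_sorted
    method=srs
    samprate=0.5
    seed=1234
    out=split_strata
    outall;
    strata gender age_group;
run;
```

With a single SAMPRATE= value, the same rate applies to every stratum, so each stratum is split roughly in half and the overall total comes out near 50%. Different rates per stratum would instead go in a secondary data set passed to SAMPRATE=, keyed by the strata variables; the documentation gives the required variable names.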

dewittme
Fluorite | Level 6

Thanks for the advice to read the documentation.

mkeintz
PROC Star

Mark your message as a solution.  It doesn't matter that you specified the solution to your own topic.
