Ronein
Onyx | Level 15

Hello

I want to take a random sample of 10 observations for each category of "origin".

How can I do this, please?

As a first step I created serial numbers within each group, but from there I don't know how to tell SAS to choose 10 observations at random from each group.

Please note that the wanted data set should contain 10 observations from each group. Since there are 3 origins (Asia, Europe, USA), the wanted data set will have 30 observations.

 

Thanks

Erik

 

 

proc sort data=sashelp.cars out=cars;
by origin;
run;


/* Number the observations within each origin group, starting at 1 */
data cars2;
set cars;
by origin;
if first.origin then serial=0;
serial+1;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Watts
SAS Employee

One approach is to use PROC SURVEYSELECT. These statements select a random sample of 10 observations from each level of origin. 

proc surveyselect data=cars2 n=10 out=sample;
     strata origin;
run;
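
A minimal sketch of the same idea with a reproducible result, in case it helps. The SEED= value and the PROC FREQ check are additions for illustration and not part of the original answer; METHOD=SRS is written out only to make the default (simple random sampling without replacement) explicit.

```sas
/* Sort by the strata variable so the input is grouped by origin */
proc sort data=sashelp.cars out=cars;
   by origin;
run;

/* Draw 10 observations per origin; SEED= (arbitrary value) makes the draw repeatable */
proc surveyselect data=cars method=srs n=10 seed=12345 out=sample;
   strata origin;
run;

/* Check: each origin should show a frequency of 10 */
proc freq data=sample;
   tables origin;
run;
```

Without SEED=, the procedure picks its own seed, so each run can return a different sample.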

 


4 REPLIES 4
Ronein
Onyx | Level 15
Thank you so much!
Two questions, please:
1. Is the step that creates serial numbers within each group essential?
2. What is the process if the "group" is defined by multiple fields, for example the origin and make fields?
Watts
SAS Employee

1. No, PROC SURVEYSELECT doesn't require that (but it's fine to use some type of sequential ID variable if that's useful for your application). By default, the proc includes all variables from the DATA= input data set in the OUT= sample data set. Alternatively, you can use the ID statement to specify which variables to include. 

 

2. You can specify more than one variable in the STRATA statement. The groups (strata) are defined by the combination of STRATA variable levels, and samples are selected independently from the separate groups.

strata origin make;
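
Both points could be put together roughly as follows. This is a sketch under a few assumptions: the data set names cars_by_make and sample_make, the SEED= value, and the variables listed in the ID statement are chosen only for illustration. Because some origin and make combinations in sashelp.cars have fewer than 10 cars, the sketch adds the SELECTALL option, which takes every unit in a stratum that is smaller than the requested sample size.

```sas
/* Sort by all strata variables, in the order used in the STRATA statement */
proc sort data=sashelp.cars out=cars_by_make;
   by origin make;
run;

/* Up to 10 observations per origin*make combination.
   SELECTALL: strata with fewer than 10 units are taken in full. */
proc surveyselect data=cars_by_make n=10 seed=12345 selectall
                  out=sample_make;
   strata origin make;
   id model type msrp;   /* keep only these variables (plus the strata variables) */
run;
```

No serial number is created anywhere in this sketch; the STRATA statement alone defines the groups.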

 

ballardw
Super User

@Watts wrote:

 

2. You can specify more than one variable in the STRATA statement. The groups (strata) are defined by the combination of STRATA variable levels, and samples are selected independently from the separate groups.

strata origin make;

 


Setting the stratum sample sizes gets more involved if you want different sizes for different combinations of the strata variables. At that point, supplying the sizes in a data set through the SAMPSIZE= option may be easier than typing a literal list of values.
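
A hedged sketch of the data set approach, assuming one stratum variable for brevity. The data set name sizes, the sample sizes 5, 10 and 15, and the SEED= value are made up for illustration. The secondary data set carries the strata variable plus the per-stratum size in a variable named _NSIZE_, with rows in the same order as the strata in the input data.

```sas
proc sort data=sashelp.cars out=cars;
   by origin;
run;

/* One row per stratum; _NSIZE_ holds the sample size for that stratum */
data sizes;
   length origin $6;
   input origin $ _nsize_;
   datalines;
Asia 5
Europe 10
USA 15
;
run;

/* SAMPSIZE= points at the data set instead of a single number */
proc surveyselect data=cars sampsize=sizes seed=12345 out=sample_sizes;
   strata origin;
run;
```

The literal-list equivalent would be n=(5 10 15), with the values given in stratum order. With two or more strata variables the list grows quickly, which is where a data set tends to be easier to maintain.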
