Ronein
Meteorite | Level 14

Hello

I want to take a random sample of 10 observations for each category of "origin".

What is the way to do it please?

As a first step I created serial numbers within each group, but from there I don't know how to tell SAS to choose 10 observations randomly from each group.

Please note that the wanted data set should contain 10 observations from each group. Since there are 3 origins (Asia, Europe, USA), the wanted data set will have 30 observations.

 

Thanks

Erik

proc sort data=sashelp.cars out=cars;
by origin;
run;


data cars2;
set cars;
by origin;
/* reset the counter at the start of each origin group so numbering begins at 1 */
if first.origin then serial=0;
serial+1;
run;
Accepted Solution
Watts
SAS Employee

One approach is to use PROC SURVEYSELECT. These statements select a random sample of 10 observations from each level of origin. 

proc surveyselect data=cars2 n=10 out=sample;
     strata origin;
run;
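
For a repeatable draw, a minimal sketch along the same lines is shown below. The SEED= value, the explicit METHOD=SRS, and the data set names cars_sorted and sample_check are illustrative choices, not part of the original answer. The STRATA statement expects the input to be sorted (or at least grouped) by the strata variable, which the PROC SORT step takes care of, and the PROC FREQ step is only a quick check that each origin contributed 10 rows.

```sas
/* Sort by the stratification variable, as the STRATA statement expects */
proc sort data=sashelp.cars out=cars_sorted;
   by origin;
run;

/* Simple random sampling without replacement, 10 rows per origin.   */
/* seed=12345 is an arbitrary value chosen here so reruns reproduce  */
/* the same sample.                                                  */
proc surveyselect data=cars_sorted method=srs n=10 seed=12345 out=sample_check;
   strata origin;
run;

/* Count the sampled rows per origin */
proc freq data=sample_check;
   tables origin / nocum nopercent;
run;
```

Without SEED=, the procedure derives a seed from the system clock, so each run would generally return a different sample.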

 


Ronein
Meteorite | Level 14
Thank you so much!
Two questions, please:
1. Is the step of creating serial numbers for each group essential?
2. What is the process if the "group" is defined by multiple fields, for example origin and make?
Watts
SAS Employee

1. No, PROC SURVEYSELECT doesn't require that (but it's fine to use some type of sequential ID variable if that's useful for your application). By default, the proc includes all variables from the DATA= input data set in the OUT= sample data set. Alternatively, you can use the ID statement to specify which variables to include. 

 

2. You can specify more than one variable in the STRATA statement. The groups (strata) are defined by the combination of STRATA variable levels, and samples are selected independently from the separate groups.

strata origin make;
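
Put together, a two-variable version might look like the sketch below. The data set names cars_by_make and sample_by_make and the seed are hypothetical, and the input is sorted by both STRATA variables first. One thing to watch for: some origin/make combinations in sashelp.cars appear to have fewer than 10 cars, and a fixed n=10 would then be rejected for those strata. The SELECTALL option is one way to handle that, by taking every unit in any stratum that is smaller than the requested sample size.

```sas
/* Sort by both stratification variables */
proc sort data=sashelp.cars out=cars_by_make;
   by origin make;
run;

/* Up to 10 cars per origin/make combination.                        */
/* selectall: if a stratum has fewer than 10 units, take all of them */
/* rather than stopping with an error.                               */
proc surveyselect data=cars_by_make method=srs n=10 seed=12345
                  selectall out=sample_by_make;
   strata origin make;
run;
```

With SELECTALL the output no longer has exactly 10 rows per group, so it is worth checking the per-stratum counts if equal sizes matter for the analysis.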

 

ballardw
Super User

@Watts wrote:

 

2. You can specify more than one variable in the STRATA statement. The groups (strata) are defined by the combination of STRATA variable levels, and samples are selected independently from the separate groups.

strata origin make;

 


It gets to be a bit of fun setting the strata sizes if you want different sizes for different combinations of the strata variables. At that point, giving SAMPSIZE= the name of a data set may be an easier way to set the sizes than a literal list.
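
A rough sketch of that idea, assuming you want 15, 10 and 5 cars from Asia, Europe and USA (made-up numbers for illustration): the secondary data set carries the STRATA variable plus a variable named _NSIZE_ holding the size for each stratum, with the strata in the same order as in the sorted input. The data set names sizes, cars_sorted and sample_uneven are hypothetical.

```sas
proc sort data=sashelp.cars out=cars_sorted;
   by origin;
run;

/* One row per stratum: the strata variable and the wanted sample size. */
/* origin is given the same type and length as in sashelp.cars.         */
data sizes;
   length origin $6;
   input origin $ _nsize_;
   datalines;
Asia 15
Europe 10
USA 5
;
run;

/* sampsize= names the data set instead of giving a single number */
proc surveyselect data=cars_sorted method=srs sampsize=sizes
                  seed=12345 out=sample_uneven;
   strata origin;
run;
```

For only three strata the literal list form, n=(15 10 5), would do the same job; the data set form tends to pay off when the strata are defined by several variables and the list gets long.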
