ghartge
Quartz | Level 8

Greetings,

 

OK, so I have a dataset where each row has

-course

-faculty

-student

-Group_Name

There are five different groups for the Group_Name value.

 

Is it possible to use PROC SURVEYSELECT to draw a random sample that contains at least one record from each group while keeping the overall sample size at N?

I have used PROC SURVEYSELECT with STRATA Group_Name; to produce data where each group is represented, but then the sample size of each group equals my N. In other words, I get five groups of 31 in my output data (155 records) instead of a total of 31 records with each of the five groups represented at least once.

 

PROC SURVEYSELECT DATA = Data_In OUT = Data_Out
n=31
seed = 12345
method = srs;
STRATA Group_Name;
RUN ;

 

To restate my question, I would like to produce only 31 records in my Data_Out dataset, but have each "Group" represented at least once.

 

Commenting out the "STRATA Group_Name;" line produces 31 records, equal to my N value, and has so far produced at least one record from each group. Is this how PROC SURVEYSELECT is designed to work, or have I simply been lucky?

 

Thanks,

 

Gary

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @ghartge,

 

I think adding the ALLOC=PROP option to the STRATA statement should solve the problem:

strata Group_Name / alloc=prop;

The documentation of the related ALLOCMIN= option (by which you could request at least n observations per stratum) says: "By default, PROC SURVEYSELECT allocates at least one sampling unit to each stratum." At the same time, proportional allocation comes close to what a simple random sample would yield on average.
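
For context, here is a sketch of how the complete call might look with that option. It assumes the data set and variable names from the question; Data_Sorted is a hypothetical intermediate name, added because the STRATA statement expects the input to be sorted by the stratum variable. With an allocation method specified, n=31 is treated as the total sample size rather than the size per stratum.

```sas
/* Sort by the stratum variable (Data_Sorted is a hypothetical name) */
proc sort data=Data_In out=Data_Sorted;
  by Group_Name;
run;

/* n=31 is the total sample size, allocated proportionally across groups */
proc surveyselect data=Data_Sorted out=Data_Out
                  method=srs n=31 seed=12345;
  strata Group_Name / alloc=prop;
run;

/* Check how the 31 records are distributed across the groups */
proc freq data=Data_Out;
  tables Group_Name;
run;
```

Without the STRATA statement, a simple random sample carries no such guarantee: a small group can be missed purely by chance, so seeing every group in an unstratified sample depends on the group sizes and on the seed.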

 

Edit: If you want to allow variability in the frequency distribution of variable Group_Name in the result, you can perform the selection in two steps:

  1. One randomly selected observation from each group.
  2. A simple random sample of 31−5=26 observations from the remaining observations, without stratification.

Code:

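/* Step 1: select one observation per group.                      */
/* data_in must be sorted by Group_Name for the STRATA statement. */
/* OUTALL keeps all observations and flags the selected ones.     */
proc surveyselect data=data_in
                  method=srs n=1 outall
                  seed=12345 out=step1;
  strata Group_Name;
run;

/* Step 2: unstratified simple random sample of 26 observations */
/* from those not selected in step 1.                           */
proc surveyselect data=step1(where=(not selected))
                  method=srs n=26
                  seed=2718 out=step2;
run;

/* Combine both samples (5 + 26 = 31), interleaved by Group_Name */
data want;
  set step1(where=(selected))
      step2;
  by Group_Name;
  drop Selected SelectionProb SamplingWeight;
run;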
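
As a quick sanity check (a sketch, assuming the final data set is named want as in the code above), a frequency table should show all five Group_Name values and a total of 31 observations:

```sas
/* Each Group_Name should appear at least once; the total should be 31 */
proc freq data=want;
  tables Group_Name;
run;
```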

ghartge
Quartz | Level 8

Great @FreelanceReinh ! Thank you and thanks for the quick response.

Gary
