Calcite | Level 5

## Survey Selection

I'm trying to generate a random sample of employee-level data for a survey. I want to ensure I have a minimum of 75 people in each subgroup, such as gender, line of business, age band, etc.

Accepted Solutions
Super User

## Re: Survey Selection

What is your population size?

Generally, if you want something representative of the population as a whole, a simple random sample of sufficient size, without constraints, would work. But if the sizes you want for specific characteristics require disproportionate sampling, you may have to stratify on one or more of those characteristics.

If this were my project, I would likely start with a random sample of around 200 and examine the characteristics of those selected (PROC FREQ, anyone?). If that works, we're golden. If I'm close to the desired sizes in the characteristics, then I'd increase the sample size a bit.

If one of the characteristics doesn't get close at all, then stratify on that variable. With multiple constraints, the strata sizes would need to be larger.
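
The first step above might look something like the sketch below. The data set `work.employees` and the variables `gender`, `age_band`, and `line_of_business` are assumed names for illustration, not anything from the original question, and the seed is arbitrary.

```sas
/* Hypothetical input: work.employees, one row per employee, with the
   assumed variables gender, age_band, and line_of_business. */
proc surveyselect data=work.employees
                  out=work.srs_sample
                  method=srs      /* simple random sampling without replacement */
                  sampsize=200    /* starting size suggested above */
                  seed=12345;     /* arbitrary seed so the draw is reproducible */
run;

/* One-way frequency tables: compare each subgroup count with the minimum of 75. */
proc freq data=work.srs_sample;
    tables gender age_band line_of_business / nocum;
run;
```

If a subgroup falls short, raising `sampsize=` and rerunning is the cheap next step before moving to a `strata` statement.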

Super User

## Re: Survey Selection

In each group individually, or in total?

I.e., 75 females aged 20-30,

or 75 females,

75 aged 20-30,

75 males?

Either way, take a look at PROC SURVEYSELECT.

Calcite | Level 5

## Re: Survey Selection

75 female/75 males
75 age 20-30, 75 age 40-50
Not crossed with each other
Calcite | Level 5

That's a minimum.
Super User

## Re: Survey Selection

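Is this what you are looking for?

```
/* STRATA requires the input to be sorted by the stratum variable */
proc sort data=sashelp.class out=class;
    by sex;
run;

/* Sample 10% of each stratum, but never fewer than 8 per stratum */
proc surveyselect data=class nmin=8 samprate=.1 out=want;
    strata sex;
run;
```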
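
A variation on the same idea, offered only as a sketch: if the total sample size is fixed in advance, the `STRATA` statement can allocate it proportionally across strata while enforcing a floor per stratum. The total of 17 and the floor of 8 are arbitrary values chosen to fit `sashelp.class`; for the employee data the floor would be 75 and the total whatever the survey budget allows.

```sas
/* STRATA requires the input to be sorted by the stratum variable */
proc sort data=sashelp.class out=class;
    by sex;
run;

/* Allocate a total of 17 units proportionally to stratum size,
   with at least 8 units in each stratum (illustrative numbers only) */
proc surveyselect data=class out=want_prop
                  method=srs
                  sampsize=17
                  seed=12345;   /* arbitrary seed for reproducibility */
    strata sex / alloc=prop allocmin=8;
run;
```

This handles a minimum on one stratification variable only; minimums on several uncrossed variables still need a check of the resulting sample.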
Opal | Level 21

## Re: Survey Selection

Can the same employee be selected in both a gender sample group and an age band sample group?

PG
Opal | Level 21

## Re: Survey Selection

Assuming the answer is yes, here is a method for choosing at least 2 students per sex and at least 2 per age group, with the two criteria not crossed:

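```
%macro select(dsn, id, crit, nbSel);
/* STRATA requires the data to be sorted by the stratum variable */
proc sort data=&dsn; by &crit; run;

/* Per stratum: how many more units are needed to reach &nbSel,
   given those already selected by earlier calls */
proc sql;
create table strata_&crit as
select
    &crit,
    max(0, &nbSel - sum(selected)) as SampleSize
from &dsn
group by &crit;
quit;

/* Draw the missing units from the not-yet-selected rows */
proc surveyselect data=&dsn out=sample_&crit sampsize=strata_&crit;
where not selected;
strata &crit;
run;

/* Flag the newly drawn units as selected */
proc sql;
update &dsn
set selected = 1
where &id in (select &id from sample_&crit);
quit;
%mend;

data class;
set sashelp.class;
selected = 0; /* Add this flag variable to the dataset */
run;

%select(class, name, sex, 2);
%select(class, name, age, 2);
```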
PG
Calcite | Level 5

## Re: Survey Selection

I do not want to limit myself to just a set number from each group. I want a sample that is representative of the employee base through the lens of line of business, gender, age, tenure, etc. I'm just looking to ensure I have at least a minimum number in each group, so that I can confidently say, for example, that women would prefer this more than men, or that the 20-30 age band is more likely to use the offering we are surveying on.
Super User

## Re: Survey Selection

Is 75 from a power analysis?

Calcite | Level 5

## Re: Survey Selection

It's based on a response rate assumption. I want to be left with enough sample for differences between my groups in Likert responses to be significant.
Calcite | Level 5

## Re: Survey Selection

I ultimately needed to use a combination of solutions! Thank you everyone for your advice!