What is your population size?
Generally, if you want a sample that is representative of the population as a whole, a sufficiently large simple random sample without constraints will work. But if the sizes you want for specific characteristics require disproportionate sampling, you may have to stratify on one or more of those characteristics.
If this were my project, I would likely start with a random sample of around 200 and examine the characteristics of those selected (PROC FREQ, anyone?). If that works, we're golden. If I'm close to the desired sizes in the characteristics, I would increase the sample size a bit.
If one of the characteristics doesn't get close at all, then I would stratify on that variable. With multiple constraints, the strata sizes would need to be larger.
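As a rough sketch of that first step, the code below draws a simple random sample of 200 and tabulates it. It assumes sashelp.heart as a hypothetical stand-in for the real population file, and the age bands, output name srs200, and seed are illustrative choices, not values from this thread.

```sas
/* Hypothetical stand-in: sashelp.heart plays the role of the population file */
proc format;
  value ageband              /* illustrative bands, not from the thread */
    low -< 40 = 'Under 40'
    40  -< 50 = '40-49'
    50 - high = '50 and over';
run;

/* Unconstrained simple random sample of 200 units */
proc surveyselect data=sashelp.heart out=srs200
                  method=srs sampsize=200 seed=12345;
run;

/* Count the selected units by each characteristic and by their combination */
proc freq data=srs200;
  format AgeAtStart ageband.;
  tables Sex AgeAtStart;
  tables Sex*AgeAtStart / norow nocol nopercent;
run;
```

If the counts fall short of the targets, raising SAMPSIZE= and rerunning is the cheap next step before moving to stratification.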
In each group individually, or in total?
i.e., 75 females age 20-30
or
75 females
75 age 20-30
75 males
Either way, take a look at PROC SURVEYSELECT
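For the "in each group individually" reading, a minimal sketch with PROC SURVEYSELECT might look like the following. It again assumes sashelp.heart as a hypothetical stand-in, with made-up age bands and data set names (pop, cells); 75 is the per-cell size from the question.

```sas
/* Hypothetical stand-in population with an explicit age band variable */
data pop;
  set sashelp.heart;
  length AgeBand $8;
  if AgeAtStart < 40 then AgeBand = 'Under 40';
  else if AgeAtStart < 50 then AgeBand = '40-49';
  else AgeBand = '50+';
run;

/* The STRATA statement expects the input sorted by the strata variables */
proc sort data=pop;
  by Sex AgeBand;
run;

/* Draw 75 units from every Sex x AgeBand cell */
proc surveyselect data=pop out=cells method=srs sampsize=75 seed=12345;
  strata Sex AgeBand;
run;

/* Confirm the size of each cell in the sample */
proc freq data=cells;
  tables Sex*AgeBand / norow nocol nopercent;
run;
```

For the "in total" reading, listing a single variable on the STRATA statement (for example Sex alone) gives 75 per level of that variable instead of 75 per cell.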
Is it what you are looking for?
proc sort data=sashelp.class out=class;
  by sex;
run;
proc surveyselect data=class nmin=8 samprate=.1 out=want;
  strata sex;
run;
Can the same employee be selected in both a gender sample group and an age band sample group?
Assuming the answer is yes, here is a method for choosing 2 students per sex and 2 per age group, reusing earlier selections where possible:
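%macro select(dsn, id, crit, nbSel);
  /* STRATA requires the data to be sorted by the stratification variable */
  proc sort data=&dsn; by &crit; run;

  /* Per stratum: how many more units are needed after earlier selections */
  proc sql;
    create table strata_&crit as
    select
      &crit,
      max(0, &nbSel - sum(selected)) as SampleSize
    from &dsn
    group by &crit;
  quit;

  /* Sample only from units not yet selected.
     selectall takes every remaining unit when a stratum holds fewer
     units than requested (sashelp.class has a single 16-year-old). */
  proc surveyselect data=&dsn out=sample_&crit sampsize=strata_&crit selectall;
    where not selected;
    strata &crit;
  run;

  /* Flag the newly selected units in the master data set */
  proc sql;
    update &dsn
    set selected = 1
    where &id in (select &id from sample_&crit);
  quit;
%mend;

data class;
  set sashelp.class;
  selected = 0; /* Add this variable to the dataset */
run;

%select(class, name, sex, 2);
%select(class, name, age, 2);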
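To check the result, a short tabulation of the flagged rows can help. This sketch assumes the class data set and its selected flag exactly as created by the calls above.

```sas
/* Count the flagged students by each selection criterion */
proc freq data=class;
  where selected = 1;
  tables sex age / nocum nopercent;
run;
```

Counts above 2 in a group are expected here, since a student picked for one criterion also counts toward the other.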
Is 75 from a power analysis?