zohraafzal
Calcite | Level 5

Hi,

 

I am working on a large data set for my thesis and need help with stratified random sampling within groups. The data set has a Client variable with four groups: Female_cases, Male_cases, Female_control and Male_control. I want to keep all male and female cases and, for each case, select 4 controls matched on age and race, i.e. 4 Female_control records for each Female_cases record and 4 Male_control records for each Male_cases record.

 

ID   Client          Race   Age   Hospital ID   Services
1    Female_cases    Black  45    000152        PS
2    Male_cases      White  34    000121        HS
3    Female_control  Asian  50    000542        HS
4    Male_control    White  44    000199        HS

 

I want to add that I am using SAS University Edition.

PGStats
Opal | Level 21

How do you want to match ages? Do you want exact matches, matches within classes (21-25, 26-30, ...), or something else?

PG
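
If matching within age classes is the choice, one way to build the classes is sketched below. This is only an illustration: the data set name `have` and the variable `ageClass` are hypothetical, and the 5-year width simply follows the 21-25, 26-30 example. The class variable could then stand in for age when matching.

```sas
/* Hypothetical input: a few ages to show the banding */
data have;
input id age;
datalines;
1 21
2 25
3 26
4 30
5 31
;
run;

/* Label each 5-year class by its lower bound:
   21-25 -> 21, 26-30 -> 26, 31-35 -> 31 */
data banded;
set have;
ageClass = 5 * floor((age - 1) / 5) + 1;
run;

proc print data=banded; run;
```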
Reeza
Super User

PROC psmatch?

zohraafzal
Calcite | Level 5

Thank you for your message and sorry for the late reply!

I will use the code you suggested and see what happens.

 

PGStats
Opal | Level 21

Here is a simple approach for exact race and age matching:

 

data cases;
input id race $ age;
datalines;
1 A 21
3 B 31
4 B 31
;

data control;
input id race $ age;
datalines;
5 A 18
6 A 21
7 A 21
8 B 10
9 B 31
10 B 31
11 B 31
12 B 32
;

/* Create one copy of each case per control wanted (2 in this demo; use 4 for 4:1 matching) */
data cases2;
set cases;
do i = 1 to 2;
    output;
    end;
drop i;
run;

/* Put the controls in random order */
data controlr;
set control;
rnd = rand('uniform');
run;

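/* Sort by the matching variables first, then by rnd, so that the order
   within each race/age group is random (sorting by id first would undo that) */
proc sort data=controlr; by race age rnd; run;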

/* Match cases and controls */
data sample;
merge 
    cases2 (in=inCases)
    controlr (rename=(id=controlId));
by race age;
if controlId = lag(controlId) then controlId = .;
if inCases;
drop rnd;
run;

proc print data=sample; run;
PG
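
For the layout in the original question (4 controls per case, matched on sex, race and age), the same idea might be adapted roughly as follows. This is a sketch under assumptions, not tested on the real data: the input data set `have` and the names `sex`, `grp`, `slot`, `caseId`, `controlId` and the seed value are all made up for illustration, and sex is taken from the text before the underscore in Client. Numbering the slots on both sides and merging one-to-one is a swapped-in variation on the merge above: as far as I can tell it keeps surplus controls in a group from being attached to the last case, and it leaves `controlId` missing where a group runs out of controls.

```sas
/* Hypothetical input HAVE with id, client, race, age as in the question */
data cases controls;
set have;
length sex $6 grp $7;
sex = scan(client, 1, '_');           /* Female or Male */
grp = lowcase(scan(client, 2, '_'));  /* cases or control */
if grp =: 'case' then output cases;
else output controls;
run;

proc sort data=cases; by sex race age id; run;

/* Four numbered slots per case within each sex/race/age group */
data caseSlots;
set cases (keep=id sex race age rename=(id=caseId));
by sex race age;
if first.age then slot = 0;
do i = 1 to 4;
    slot + 1;
    output;
    end;
drop i;
run;

/* Shuffle the controls inside each group (the seed is arbitrary) */
data controlsR;
call streaminit(12345);
set controls (keep=id sex race age rename=(id=controlId));
rnd = rand('uniform');
run;

proc sort data=controlsR; by sex race age rnd; run;

/* Number the shuffled controls within each group */
data controlSlots;
set controlsR;
by sex race age;
if first.age then slot = 0;
slot + 1;
drop rnd;
run;

/* One-to-one merge on group and slot; keep only case slots */
data matched;
merge caseSlots (in=inCase) controlSlots;
by sex race age slot;
if inCase;
run;

proc print data=matched; run;
```

Slots are filled in case order, so when a group has fewer than 4 controls per case the earlier cases are filled first and the later ones end up with missing `controlId` values.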
