BookmarkSubscribeRSS Feed
momolito
Fluorite | Level 6

Hello 

 

I am trying to match cases to controls randomly in a ratio of 1:5 by sex and age. I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls, with the variables below; (rand_num - assigned random numbers)

case_id, control_id, case_age, control_age, case_sex, control_sex, rand_num

 

I initially used the code below to select 5 controls for each case.

 

proc sort data=HAVE ;
by case_id rand_num;
run;

 

data WANT not_enough;
set HAVE;
by case_id ;
retain num;
if first.case_id then num=1;
if num le 5 then do;
output WANT;
num=num+1;
end;
if last.case_id then do;
if num le 5 then output not_enough;
end;
run;

 

How can I modify this in order to select 5 controls for each case without replacement (using controls only once)?

Thanks

4 REPLIES 4
rudfaden
Lapis Lazuli | Level 10
Try to look at proc SURVEYSELECT
momolito
Fluorite | Level 6

I have tried to use Proc surveyselect as you suggested with the code below. note - at this point I have 2 dataset, one for CONTROLS (n=>10000) and another for CASES (n=986). each contains the id, age and sex. I am trying to match in a ratio of 1:5 without repeating controls.

 

proc surveyselect data=CONTROLS method=srs out=matched_controls seed=12345 noprint samprate=5;
strata AGE SEX;
id ID;
run;

 

proc sql;
create table matched_pair as
select a.ID as case_id, b.ID as control_id
from cases a, matched_controls b
where a.age between b.age-5 and b.age+5
and a.sex = b.sex
and row_number()OVER(partition by a.Release_No)as <= 5;
quit;

 

However, my outcome still has duplicate controls and is not matching at the 1:5 ratio. How can i modify this to achieve what i want? I am quite new to SAS. Thanks in advance

Patrick
Opal | Level 21

" I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls"

Please share your have data.

momolito
Fluorite | Level 6

Hello Patrick

 

I have attached my 'Have' dataset (from the SQL step), as well as the cases and controls datasets.

Thanks for the help.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 898 views
  • 1 like
  • 3 in conversation