BookmarkSubscribeRSS Feed
momolito
Fluorite | Level 6

Hello 

 

I am trying to match cases to controls randomly in a ratio of 1:5 by sex and age. I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls, with the variables below; (rand_num - assigned random numbers)

case_id, control_id, case_age, control_age, case_sex, control_sex, rand_num

 

I initially used the code below to select 5 controls for each case.

 

proc sort data=HAVE ;
by case_id rand_num;
run;

 

data WANT not_enough;
set HAVE;
by case_id ;
retain num;
if first.case_id then num=1;
if num le 5 then do;
output WANT;
num=num+1;
end;
if last.case_id then do;
if num le 5 then output not_enough;
end;
run;

 

How can I modify this in order to select 5 controls for each case without replacement (using controls only once)?

Thanks

4 REPLIES 4
rudfaden
Lapis Lazuli | Level 10
Try to look at proc SURVEYSELECT
momolito
Fluorite | Level 6

I have tried to use Proc surveyselect as you suggested with the code below. note - at this point I have 2 dataset, one for CONTROLS (n=>10000) and another for CASES (n=986). each contains the id, age and sex. I am trying to match in a ratio of 1:5 without repeating controls.

 

proc surveyselect data=CONTROLS method=srs out=matched_controls seed=12345 noprint samprate=5;
strata AGE SEX;
id ID;
run;

 

proc sql;
create table matched_pair as
select a.ID as case_id, b.ID as control_id
from cases a, matched_controls b
where a.age between b.age-5 and b.age+5
and a.sex = b.sex
and row_number()OVER(partition by a.Release_No)as <= 5;
quit;

 

However, my outcome still has duplicate controls and is not matching at the 1:5 ratio. How can i modify this to achieve what i want? I am quite new to SAS. Thanks in advance

Patrick
Opal | Level 21

" I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls"

Please share your have data.

momolito
Fluorite | Level 6

Hello Patrick

 

I have attached my 'Have' dataset (from the SQL step), as well as the cases and controls datasets.

Thanks for the help.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1339 views
  • 1 like
  • 3 in conversation