BookmarkSubscribeRSS Feed
momolito
Fluorite | Level 6

Hello 

 

I am trying to match cases to controls randomly in a ratio of 1:5 by sex and age. I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls, with the variables below; (rand_num - assigned random numbers)

case_id, control_id, case_age, control_age, case_sex, control_sex, rand_num

 

I initially used the code below to select 5 controls for each case.

 

proc sort data=HAVE ;
by case_id rand_num;
run;

 

data WANT not_enough;
set HAVE;
by case_id ;
retain num;
if first.case_id then num=1;
if num le 5 then do;
output WANT;
num=num+1;
end;
if last.case_id then do;
if num le 5 then output not_enough;
end;
run;

 

How can I modify this in order to select 5 controls for each case without replacement (using controls only once)?

Thanks

4 REPLIES 4
rudfaden
Pyrite | Level 9
Try to look at proc SURVEYSELECT
momolito
Fluorite | Level 6

I have tried to use Proc surveyselect as you suggested with the code below. note - at this point I have 2 dataset, one for CONTROLS (n=>10000) and another for CASES (n=986). each contains the id, age and sex. I am trying to match in a ratio of 1:5 without repeating controls.

 

proc surveyselect data=CONTROLS method=srs out=matched_controls seed=12345 noprint samprate=5;
strata AGE SEX;
id ID;
run;

 

proc sql;
create table matched_pair as
select a.ID as case_id, b.ID as control_id
from cases a, matched_controls b
where a.age between b.age-5 and b.age+5
and a.sex = b.sex
and row_number()OVER(partition by a.Release_No)as <= 5;
quit;

 

However, my outcome still has duplicate controls and is not matching at the 1:5 ratio. How can i modify this to achieve what i want? I am quite new to SAS. Thanks in advance

Patrick
Opal | Level 21

" I have a data set (HAVE - created by PROC SQL step) containing the cases and potential controls"

Please share your have data.

momolito
Fluorite | Level 6

Hello Patrick

 

I have attached my 'Have' dataset (from the SQL step), as well as the cases and controls datasets.

Thanks for the help.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 556 views
  • 1 like
  • 3 in conversation