## Matching

Hello,

I am carrying out a case-control study. I would like to match one control to each case on a continuous variable (age), within +/- 1 year. Each control should be used only once, rather than being selected for more than one case.

I would appreciate any help on this.

Thanks
Hani

## Re: Matching

Can you show us a couple of simple data sets that illustrate the problem and the hoped-for result? It might help us understand what you are trying to achieve.

## Re: Matching

Check out

http://www2.sas.com/proceedings/sugi29/173-29.pdf

I searched support.sas.com for "case control matching".
This is a common problem, so there is no need to reinvent the wheel.

Doc Muhlbaier
Duke

## Re: Matching

Hello,

Thanks for the replies.

Basically, I have a group of cases and another group of controls.
I need to match one control to each case on one continuous variable (LOS), where the matching criterion is +/- 1.

For instance:

case subject A, LOS=1.5,
case subject B, LOS=2.0,

control subject a, LOS=1.7,
control subject b, LOS=1.9,

In this case, control subject a is a potential control for both cases A and B, since its LOS is within 1 year of each case's LOS. Similarly, control subject b is a potential control for both case subjects A and B. So what I need is to select a random control for case subject A, and that control should then be removed from the pool so that it is not considered again for case subject B.

I have already worked from the recommended paper by Hugh Kawabata, and the program I used from that paper follows.

I know there is something wrong in the coding since very few controls are being selected for all of the cases.

Finally, I would appreciate your help in identifying the problem.

Thank you very much

Hani

    data study control;
        set matching;
        rand_num=uniform(0);
        if case=1 then output study;
        else output control;
    run;

    data control2;
        set control;
        LOS_low=LOS-1;
        LOS_high=LOS+1;
    run;

    proc sql;
        create table controls_id as
        select
            one.studyid as study_id,
            two.studyid as control_id,
            one.LOS as study_LOS,
            two.LOS as control_LOS,
            one.rand_num as rand_num
        from study one, control2 two
        where (one.LOS between two.LOS_low and two.LOS_high);
    quit;

    * count the number of control subjects for each case subject;
    proc sort data=controls_id;
        by study_id;
    run;

    data controls_id2(keep=study_id num_controls);
        set controls_id;
        by study_id;
        retain num_controls;
        if first.study_id then num_controls=1;
        else num_controls=num_controls+1;
        if last.study_id then output;
    run;

    * now merge the counts back into the dataset;
    data controls_id3;
        merge controls_id
              controls_id2;
        by study_id;
    run;

    * now order the rows to select the first matching control;
    proc sort data=controls_id3;
        by control_id num_controls rand_num;
    run;

    data controls_id4;
        set controls_id3;
        by control_id;
        if first.control_id;
    run;

    * Now, as before, randomly select the fixed number (in our example, two) of control subjects for each case.;
    proc sort data=controls_id;
        by study_id rand_num;
    run;

    data controls_id2 not_enough;
        set controls_id;
        by study_id;
        retain num;
        if first.study_id then num=1;
        if num le 2 then do;
            output controls_id2;
            num=num+1;
        end;
        if last.study_id then do;
            if num le 2 then output not_enough;
        end;
    run;

    proc print data=controls_id2(obs=40);
        title2 'matched patients';
    run;
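
For comparison, here is a minimal sketch of one way to get 1:1 matching without replacement: build every case-control pair within the +/- 1 window, shuffle the candidate controls within each case, then walk through the pairs once and skip any control that has already been assigned. It assumes the `matching` dataset and the `studyid`, `case`, and `LOS` variables from the program above; the names `pairs`, `matched`, `unmatched`, `used`, `found`, and the seed value are made up for illustration. This is a simple greedy pass, not the method from the paper, so the result depends on the order in which cases are processed and it may leave a case unmatched that a smarter assignment could have matched.

```sas
/* Assumed input: work.matching with studyid, case (1 = case), and LOS. */

/* 1. Every case-control pair whose LOS values differ by at most 1. */
proc sql;
    create table pairs as
    select c.studyid as study_id,
           k.studyid as control_id,
           c.LOS     as study_LOS,
           k.LOS     as control_LOS
    from matching as c, matching as k
    where c.case = 1 and k.case ne 1
      and abs(c.LOS - k.LOS) le 1;
quit;

/* 2. Give each PAIR its own random key so that candidate controls
      are shuffled within a case (the seed is an arbitrary choice). */
data pairs;
    set pairs;
    rand_num = ranuni(20240101);
run;

proc sort data=pairs;
    by study_id rand_num;
run;

/* 3. Greedy pass: for each case keep the first control that has not
      been used yet, and remember it in a hash so it cannot be reused. */
data matched(keep=study_id control_id study_LOS control_LOS)
     unmatched(keep=study_id study_LOS);
    if _n_ = 1 then do;
        declare hash used();            /* controls already assigned */
        used.defineKey('control_id');
        used.defineDone();
    end;
    set pairs;
    by study_id;
    retain found;
    if first.study_id then found = 0;
    if not found and used.check() ne 0 then do;   /* 0 means already used */
        used.add();
        found = 1;
        output matched;
    end;
    /* Cases whose candidates were all taken by earlier cases.
       Cases with no candidate at all never appear in pairs. */
    if last.study_id and not found then output unmatched;
run;

/* 4. Sanity check: the two counts should be equal if no control is reused. */
proc sql;
    select count(*) as n_pairs,
           count(distinct control_id) as n_controls
    from matched;
quit;
```

A few things in the program above may be worth checking against this sketch. Its last step reads `controls_id` rather than `controls_id4`, so the one-case-per-control filtering is not carried into the final selection. The `rand_num` column is taken from the case row (`one.rand_num`), so it is constant within a `study_id` and the sort by `study_id rand_num` does not shuffle the candidate controls. Finally, `num le 2` keeps two controls per case rather than one.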