## Case-control 1-4 matching by date

Solved
Frequent Contributor
Posts: 138

# Case-control 1-4 matching by date

Hello,

I have a dataset composed of 1.5 million subjects which are divided into cases and controls. I need SAS coding to create a new dataset composed of all the cases plus four randomly selected controls for each case, matched by date interviewed.

Any help is appreciated and I am also available to provide more details if needed.

Thanks!

Accepted Solutions
Solution
‎12-26-2012 10:17 PM
Posts: 5,543

## Re: Case-control 1-4 matching by date

You can match cases and 4 controls that were interviewed on the same day (dateInt) randomly this way:

data have;
input id case DateInt :yymmdd. @@;
format dateInt date11.;
datalines;
1  0 20120101  2 0 20120103  3 1 20120101
4  0 20120103  5 0 20120101  6 1 20120103
7  0 20120101  8 0 20120102  9 0 20120103
10 1 20120103 11 0 20120103 12 0 20120103
13 0 20120103 14 0 20120103 15 0 20120103
;

/* Assign a random order to all cases and controls */
data tmp;
call streaminit(98768976);
set have;
rnd = rand("UNIFORM");
run;

proc sort data=tmp; by dateInt case rnd; run;

proc rank data=tmp out=tmpR;
by dateInt case;
var rnd;
run;

/* Match the first case with the first 4 controls, the second case with controls

5 to 8 and so on, within each date. */
proc sql;
create table want as
select ca.dateInt, ca.id as caseId, co.id as controlId
from tmpR as ca inner join tmpR as co
on ca.dateInt=co.dateInt and ca.rnd=ceil(co.rnd/4)
where ca.case and not co.case;
drop table tmp, tmpR;
select * from want;
quit;

PG

PG

All Replies
Solution
‎12-26-2012 10:17 PM
Posts: 5,543

## Re: Case-control 1-4 matching by date

You can match cases and 4 controls that were interviewed on the same day (dateInt) randomly this way:

data have;
input id case DateInt :yymmdd. @@;
format dateInt date11.;
datalines;
1  0 20120101  2 0 20120103  3 1 20120101
4  0 20120103  5 0 20120101  6 1 20120103
7  0 20120101  8 0 20120102  9 0 20120103
10 1 20120103 11 0 20120103 12 0 20120103
13 0 20120103 14 0 20120103 15 0 20120103
;

/* Assign a random order to all cases and controls */
data tmp;
call streaminit(98768976);
set have;
rnd = rand("UNIFORM");
run;

proc sort data=tmp; by dateInt case rnd; run;

proc rank data=tmp out=tmpR;
by dateInt case;
var rnd;
run;

/* Match the first case with the first 4 controls, the second case with controls

5 to 8 and so on, within each date. */
proc sql;
create table want as
select ca.dateInt, ca.id as caseId, co.id as controlId
from tmpR as ca inner join tmpR as co
on ca.dateInt=co.dateInt and ca.rnd=ceil(co.rnd/4)
where ca.case and not co.case;
drop table tmp, tmpR;
select * from want;
quit;

PG

PG
Posts: 2,125

## Re: Case-control 1-4 matching by date

PG's approach works for exact date matches.  If you want to look for dates 'near' the case-date, do a google search on

nearest neighbor match site:sas.com

for a bunch of related results.

Frequent Contributor
Posts: 138

## Re: Case-control 1-4 matching by date

Thank you for the reply. Upon entering the code, I received the following error message: "Column case could not be found in the table/view identified with the correlation name CA" (and another one for the correlation name CO). As I am not conversant in proc sql I am unable to troubleshoot myself and any assistance would be appreciated.

Thanks!

Posts: 5,543