I have a file with cases and controls (the combined file is 15m). I would like to select 2 controls without replacement for each case, such that the case's date is earlier than the control's date. Some controls have missing dates, but they should still be eligible for selection. A sample dataset is below.
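/* Sample data: case = 1 marks a case, case = 0 a control */
data have;
/* @@ holds the line so several id/case/date triples can be read from each record */
input id case DateInt :yymmdd. @@;
format DateInt date11.;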
datalines;
1 0 . 2 0 20120103 3 1 20120101
4 0 20120103 5 0 20120101 6 1 20120103
7 0 20120101 8 0 . 9 0 20120103
10 1 20120105 11 0 20120103 12 0 20120103
13 0 20120103 14 0 20120106 15 0 20120107
;
run;
That is, the control selection should be as follows:
(1) For case id 3, the 2 selections without replacement should be from ids 1,2,4,8,9,11,12,13,14,15.
(2) For case id 6, the 2 selections without replacement should be from ids 1,8,14,15 and excluding selections in (1) above.
(3) For case id 10, the 2 selections without replacement should be from ids 1,8,14,15 and excluding selections in (1) and (2) above.
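The three rules above could be sketched as a greedy match: shuffle the controls once, then walk through the cases from earliest to latest and give each case the first two unused controls whose date is later than the case's date or missing. This is only a rough sketch and has not been tried on the full file. The data set names `controls`, `cases`, and `matched`, the variables `u`, `seq`, `used`, and `n_found`, and the seed 2012 are all made up for illustration, and it assumes that only rows with `case = 0` can be drawn as controls.

```sas
/* Sketch only. Data set names, helper variables, and the seed are assumptions. */

/* 1. Put the controls (case = 0) in random order. */
data controls;
  if _n_ = 1 then call streaminit(2012);   /* arbitrary seed, for repeatability */
  set have(where=(case = 0));
  u = rand('uniform');
run;

proc sort data=controls;
  by u;
run;

/* 2. Number the shuffled controls and flag all of them as unused. */
data controls;
  set controls(rename=(id=control_id DateInt=control_date));
  seq  = _n_;
  used = 0;
  keep seq control_id control_date used;
run;

/* 3. Order the cases by date so the earliest case is matched first. */
proc sort data=have(where=(case = 1))
          out=cases(drop=case rename=(id=case_id DateInt=case_date));
  by DateInt id;
run;

/* 4. Give each case the first two unused, eligible controls. */
data matched;
  if 0 then set controls nobs=n_controls;  /* defines the hash host variables */
  if _n_ = 1 then do;
    declare hash h(dataset: 'controls');
    h.defineKey('seq');
    h.defineData('seq', 'control_id', 'control_date', 'used');
    h.defineDone();
  end;
  set cases;
  n_found = 0;
  do seq = 1 to n_controls while (n_found < 2);
    if h.find() = 0 then do;
      /* eligible: not yet taken, and date missing or after the case date */
      if used = 0 and (missing(control_date) or control_date > case_date) then do;
        used = 1;
        h.replace();                       /* mark this control as taken */
        n_found = n_found + 1;
        output;
      end;
    end;
  end;
  format case_date control_date date11.;
  keep case_id case_date control_id control_date;
run;
```

Taking the first two eligible entries of a random shuffle is equivalent to drawing two at random from the controls still available, so nothing is reused across cases. Because each case rescans the shuffled list from the top and the whole control list is held in an in-memory hash, this may be too slow or too large for 15m rows without further tuning. Also, since earlier cases draw from the widest pool, they can use up the late-dated or undated controls that later cases depend on; in the sample, case 10 has only ids 1, 8, 14, and 15 to draw from, so a case may end up with fewer than 2 rows in `matched`.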
Thanks for any tips. I am struggling to write good code for this.