I have a dataset of cases (N=681) and I am trying to select a 1:2 cohort of matched controls from a dataset of ~20,000 eligible patients.
Both datasets contain the variables:
patient_id
surgery_date
age
I first want to match on surgery date: for each case, select all eligible controls whose surgery took place within 30 days before or after that case's surgery date.
Then, among those patients with a matched surgery date, I would like to select the two patients who were closest in age to the case (plus or minus) to serve as its controls. If there are multiple patients equally close in age to the case, I would like to use simple random sampling to select the controls from among them.
No control should be selected more than once. A patient may be eligible to serve as a control for more than one case, but once selected as a control they must be removed from the pool of eligible controls.
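To make the logic concrete, the three steps above can be sketched in Python as a single greedy pass over the cases. This is only a rough illustration under some assumptions, not a definitive solution: the function name `match_controls` and its parameters are made up for the example, each dataset is assumed to be a list of records with `patient_id`, `surgery_date` (a `datetime.date`) and `age`, and cases are processed in the order given. Ties in age distance are broken at random by shuffling the eligible controls before a stable sort.

```python
import random
from datetime import date, timedelta


def match_controls(cases, controls, window_days=30, n_controls=2, seed=0):
    """Greedy 1:n matching without replacement (illustrative sketch).

    cases, controls: lists of dicts with keys
        patient_id, surgery_date (datetime.date), age
    Returns a dict mapping each case's patient_id to a list of the
    patient_ids selected as its controls.
    """
    rng = random.Random(seed)          # fixed seed so the result is reproducible
    window = timedelta(days=window_days)
    # Pool of controls that have not been used yet, keyed by patient_id.
    available = {c["patient_id"]: c for c in controls}
    matches = {}

    for case in cases:
        # Step 1: keep controls whose surgery is within +/- window_days.
        eligible = [
            c for c in available.values()
            if abs(c["surgery_date"] - case["surgery_date"]) <= window
        ]
        # Step 2: shuffle, then stable-sort by absolute age difference,
        # so controls that are equally close in age end up in random order.
        rng.shuffle(eligible)
        eligible.sort(key=lambda c: abs(c["age"] - case["age"]))
        chosen = eligible[:n_controls]
        # Step 3: remove the chosen controls from the pool.
        for c in chosen:
            del available[c["patient_id"]]
        matches[case["patient_id"]] = [c["patient_id"] for c in chosen]

    return matches


# Tiny made-up example (not real data).
cases = [
    {"patient_id": "A", "surgery_date": date(2020, 1, 15), "age": 50},
    {"patient_id": "B", "surgery_date": date(2020, 1, 20), "age": 52},
]
controls = [
    {"patient_id": "c1", "surgery_date": date(2020, 1, 10), "age": 50},
    {"patient_id": "c2", "surgery_date": date(2020, 2, 1), "age": 51},
    {"patient_id": "c3", "surgery_date": date(2020, 1, 18), "age": 53},
    {"patient_id": "c4", "surgery_date": date(2020, 6, 1), "age": 50},
    {"patient_id": "c5", "surgery_date": date(2020, 1, 25), "age": 60},
]
matches = match_controls(cases, controls)
print(matches)
```

Two caveats with a greedy pass like this: the result depends on the order in which cases are processed, and a case can end up with fewer than two controls if the pool within its 30-day window has been used up by earlier cases.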
Can anyone help me write the syntax to select this matched cohort? I'm having quite a hard time figuring it out!!
Thanks so much!