Hello,
I have to check matching criteria on the following variables: age (+/- 3 years), gender, and dg_date (+/- 1 month).
How can I do this?
Thanks
See the example data below.
data match;
  input Id case_control match_caseid age gender dg_date : ddmmyy10. ;
  format dg_date ddmmyy10.;
  datalines;
A1 Case 20 F 20/08/2023
A2 Control A1 23 F 21/07/2023
A3 Control A1 22 F 22/09/2023
B1 Case 30 M 01/06/2023
B2 Control B1 35 M 24/08/2023
B3 Control B1 33 M 12/06/2023
C1 Case 40 M 26/05/2023
C2 Control C1 38 M 30/04/2023
D1 Case 47 M 28/07/2023
D2 Control D1 41 M 29/06/2023
D3 Control D1 50 F 30/07/2023
D4 Control D1 63 M 31/07/2023
;
run;
If all CASEs precede the relevant CONTROL candidates, then
data match;
  input Id $ case_control $ match_caseid $ age gender $ dg_date : ddmmyy10. ;
  format dg_date ddmmyy10.;
  datalines;
A1 Case A1 20 F 20/08/2023
A2 Control A1 23 F 21/07/2023
A3 Control A1 22 F 22/09/2023
B1 Case B1 30 M 1/06/2023
B2 Control B1 35 M 24/08/2023
B3 Control B1 33 M 12/06/2023
C1 Case C1 40 M 26/05/2023
C2 Control C1 38 M 30/04/2023
D1 Case D1 47 M 28/07/2023
D2 Control D1 41 M 29/06/2023
D3 Control D1 50 F 30/07/2023
D4 Control D1 63 M 31/07/2023
;
run;
data want (drop=_:);
  /* The second reference to MATCH reads no rows (obs=0). It only puts */
  /* _age, _gender and _dg_date into the PDV for use as hash data.     */
  set match
      match (obs=0 rename=(age=_age gender=_gender dg_date=_dg_date));
  if _n_=1 then do;
    declare hash h();
    h.definekey('id');
    h.definedata('_age','_gender','_dg_date');
    h.definedone();
  end;
  if case_control='Case' then do;
    /* Store each case's values in the hash, keyed by its id */
    _age=age;
    _gender=gender;
    _dg_date=dg_date;
    h.add();
  end;
  else do;
    /* Retrieve the matched case's values and compare them with this control */
    _rc=h.find(key:match_caseid);
    if _rc^=0 then Match='Match_Caseid Not Found';
    else if gender=_gender and abs(age-_age)<=3
      and dg_date >  intnx('month',_dg_date,-1,'sameday')
      and dg_date <= intnx('month',_dg_date,+1,'sameday')
      then Match='Yes';
    else Match='No ';
  end;
run;
Edit: minor change made to the "else if .... then Match='Yes'" statement to incorporate revised interpretation of date ranges.
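To see what the two INTNX bounds in the step above work out to, here is a small sketch for case A1 (diagnosed 20/08/2023). The variable names case_date, lower and upper are only illustrative and are not part of the original step.

```sas
data _null_;
  case_date = '20aug2023'd;                           /* dg_date of case A1 */
  lower = intnx('month', case_date, -1, 'sameday');   /* same day, one month earlier */
  upper = intnx('month', case_date, +1, 'sameday');   /* same day, one month later   */
  put lower= date9. upper= date9.;
run;
```

This should write lower=20JUL2023 upper=20SEP2023 to the log. Because the step uses a strict > on the lower bound and <= on the upper bound, control A2 (21/07/2023) falls inside the window and control A3 (22/09/2023) falls outside it.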
I suggest that you run that data step and examine the results. It throws a lot of invalid data messages because of missing values.
Placing a . in the datalines wherever a value is missing may help.
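As a minimal sketch of that suggestion, the two rows below use a period as a placeholder for the missing match_caseid on the Case row, so that list input does not shift the remaining values one column to the left. The dataset name match_sketch is hypothetical, and the $ modifiers are an assumption about how the character variables should be read; they are not in the data step as originally posted.

```sas
data match_sketch;
  /* $ marks Id, case_control, match_caseid and gender as character */
  input Id $ case_control $ match_caseid $ age gender $ dg_date : ddmmyy10.;
  format dg_date ddmmyy10.;
  datalines;
A1 Case    .  20 F 20/08/2023
A2 Control A1 23 F 21/07/2023
;
run;
```

With list input, a single period is read as a missing value for both numeric and character variables, so match_caseid comes out blank for A1.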
I have no idea what you expect for a result. I think you need to expand a bit on exactly what you are checking for.
Age +/- 3 years from what?
What check is to be done for gender?
dg_date +/- 1 month from what?
I want to check whether age (+/- 3 years), gender and dg_date (+/- 1 month) agree between each case and its matched controls.
Corrected data step:
data match;
  input Id $ case_control $ match_caseid $ age gender $ dg_date : ddmmyy10. ;
  format dg_date ddmmyy10.;
  datalines;
A1 Case A1 20 F 20/08/2023
A2 Control A1 23 F 21/07/2023
A3 Control A1 22 F 22/09/2023
B1 Case B1 30 M 1/06/2023
B2 Control B1 35 M 24/08/2023
B3 Control B1 33 M 12/06/2023
C1 Case C1 40 M 26/05/2023
C2 Control C1 38 M 30/04/2023
D1 Case D1 47 M 28/07/2023
D2 Control D1 41 M 29/06/2023
D3 Control D1 50 F 30/07/2023
D4 Control D1 63 M 31/07/2023
;
run;
And what does (+/- 1 month) mean for dg_date? Is it counted in calendar months?
I.e., are 01jun2023 and 30apr2023 two months apart?
Or would it have to be some other criterion?
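To make the ambiguity concrete, here is a hedged sketch that measures the gap between those two dates in three ways. The variable names are illustrative only, and which measure is the right one depends on the study's matching rule.

```sas
data _null_;
  d1 = '30apr2023'd;
  d2 = '01jun2023'd;
  boundaries = intck('month', d1, d2);        /* calendar-month boundaries crossed  */
  elapsed    = intck('month', d1, d2, 'c');   /* complete months elapsed since d1   */
  days       = d2 - d1;                       /* plain difference in days           */
  put boundaries= elapsed= days=;
run;
```

This should write boundaries=2 elapsed=1 days=32 to the log: the dates lie two calendar months apart by boundary count, but only one complete month (and 32 days) has elapsed between them.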
Thanks, it works.