🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Quartz | Level 8

## Fuzzy Matching with Dates

Hello,

I would like to fuzzy match dates that are within 5 days of each other, one data set has records that are much earlier. I was thinking I could subtract each date from have1 by each date from have2 to find the best match.

have1

``````data have1;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
;
run;``````

have2

``````
data have2;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
;
run;``````

want

 usubjid date data1_flag data2_flag Date_have2 Date_have1 Date_Diff Z26-C64 2/2/2017 0 1 2/02/2017 . . Z26-C64 2/15/2017 1 1 2/15/2017 2/16/2017 1 Z26-C64 3/25/2017 1 1 3/25/2017 3/25/2017 0 Z26-C64 6/2/2017 1 0 . 6/2/2017 .

EDIT: I believe I need to clarify that I am trying to do this for each unique_id. The case above is for a single unique_id (defined as usubjid).

1 ACCEPTED SOLUTION

Accepted Solutions
Rhodochrosite | Level 12

## Re: Fuzzy Matching with Dates

I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.

If this does not work for you, better place some data with 2 or more subjects. Show the required output too.

``````data have1;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;

data have2;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;

proc sort data = have1;
by usubjid;
run;

proc sort data = have2;
by usubjid;
run;

data need;
set have1 (rename = (usubjid = id)) end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;
if usubjid = id then do;
d = abs(date - date2);
put id = usubjid = date = date2 = d =;

if d < 5 then output;
end;
else continue;
end;
put '===========================';
if eof then stop;
run;``````
5 REPLIES 5
Rhodochrosite | Level 12

## Re: Fuzzy Matching with Dates

There are a bunch of SAS macros that have been published to do this.  I did a google search

matching macro sas

and got a bundh.  This one looks like it will do the trick

http://support.sas.com/resources/papers/proceedings10/061-2010.pdf

Doc Muhlbaier

Duke

Quartz | Level 8

## Re: Fuzzy Matching with Dates

Hello Doc_Duke,

This macro is for matching cases to control and does this by individual usubjid's. I would need to match the same usubjid in each dataset and then fuzzy match the dates (each row) to the closest matching date within 5 days.

I will google search for matching macros, thank you for your help!

Rhodochrosite | Level 12

## Re: Fuzzy Matching with Dates

Each record of HAVE1 is compared with each record of HAVE2. Are you interested any difference of 5 days, then uncomment

the statement. Presently, the LOG gives the individual difference.

Hope this is acceptable.

``````data need;
set have1 end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;

d = abs(date - date2);
put date = date2 = d =;

/*          if d < 5 then output; */
end;
if eof then stop;
run;

The LOG:

date=02/16/2017 date2=02/02/2017 d=14
date=02/16/2017 date2=02/15/2017 d=1
date=02/16/2017 date2=03/25/2017 d=37
date=03/25/2017 date2=02/02/2017 d=51
date=03/25/2017 date2=02/15/2017 d=38
date=03/25/2017 date2=03/25/2017 d=0
date=06/02/2017 date2=02/02/2017 d=120
date=06/02/2017 date2=02/15/2017 d=107
date=06/02/2017 date2=03/25/2017 d=69``````
Quartz | Level 8

## Re: Fuzzy Matching with Dates

Hello Datasp,

This is close to what I am looking for, I am trying to tweak it so it does by for matching unique ids between the two datasets (unique_Id = usubjid). If you have any hints that would be helpful!

Best
Rhodochrosite | Level 12

## Re: Fuzzy Matching with Dates

I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.

If this does not work for you, better place some data with 2 or more subjects. Show the required output too.

``````data have1;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;

data have2;
INFILE DATALINES DLM='#';
input usubjid \$ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;

proc sort data = have1;
by usubjid;
run;

proc sort data = have2;
by usubjid;
run;

data need;
set have1 (rename = (usubjid = id)) end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;
if usubjid = id then do;
d = abs(date - date2);
put id = usubjid = date = date2 = d =;

if d < 5 then output;
end;
else continue;
end;
put '===========================';
if eof then stop;
run;``````
Discussion stats
• 5 replies
• 2878 views
• 2 likes
• 3 in conversation