Hello,
I would like to fuzzy match dates that are within 5 days of each other, one data set has records that are much earlier. I was thinking I could subtract each date from have1 by each date from have2 to find the best match.
have1
data have1;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
;
run;
have2
data have2;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
;
run;
want
usubjid | date | data1_flag | data2_flag | Date_have2 | Date_have1 | Date_Diff |
Z26-C64 | 2/2/2017 | 0 | 1 | 2/02/2017 | . | . |
Z26-C64 | 2/15/2017 | 1 | 1 | 2/15/2017 | 2/16/2017 | 1 |
Z26-C64 | 3/25/2017 | 1 | 1 | 3/25/2017 | 3/25/2017 | 0 |
Z26-C64 | 6/2/2017 | 1 | 0 | . | 6/2/2017 | . |
EDIT: I believe I need to clarify that I am trying to do this for each unique_id. The case above is for a single unique_id (defined as usubjid).
I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.
If this does not work for you, better place some data with 2 or more subjects. Show the required output too.
data have1;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;
data have2;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;
proc sort data = have1;
by usubjid;
run;
proc sort data = have2;
by usubjid;
run;
data need;
set have1 (rename = (usubjid = id)) end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;
if usubjid = id then do;
d = abs(date - date2);
put id = usubjid = date = date2 = d =;
if d < 5 then output;
end;
else continue;
end;
put '===========================';
if eof then stop;
run;
There are a bunch of SAS macros that have been published to do this. I did a google search
matching macro sas
and got a bundh. This one looks like it will do the trick
http://support.sas.com/resources/papers/proceedings10/061-2010.pdf
Doc Muhlbaier
Duke
Hello Doc_Duke,
This macro is for matching cases to control and does this by individual usubjid's. I would need to match the same usubjid in each dataset and then fuzzy match the dates (each row) to the closest matching date within 5 days.
I will google search for matching macros, thank you for your help!
Each record of HAVE1 is compared with each record of HAVE2. Are you interested any difference of 5 days, then uncomment
the statement. Presently, the LOG gives the individual difference.
Hope this is acceptable.
data need;
set have1 end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;
d = abs(date - date2);
put date = date2 = d =;
/* if d < 5 then output; */
end;
if eof then stop;
run;
The LOG:
date=02/16/2017 date2=02/02/2017 d=14
date=02/16/2017 date2=02/15/2017 d=1
date=02/16/2017 date2=03/25/2017 d=37
date=03/25/2017 date2=02/02/2017 d=51
date=03/25/2017 date2=02/15/2017 d=38
date=03/25/2017 date2=03/25/2017 d=0
date=06/02/2017 date2=02/02/2017 d=120
date=06/02/2017 date2=02/15/2017 d=107
date=06/02/2017 date2=03/25/2017 d=69
I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.
If this does not work for you, better place some data with 2 or more subjects. Show the required output too.
data have1;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data1_flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;
data have2;
INFILE DATALINES DLM='#';
input usubjid $ date mmddyy10. data12flag;
FORMAT date mmddyy10.;
datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;
proc sort data = have1;
by usubjid;
run;
proc sort data = have2;
by usubjid;
run;
data need;
set have1 (rename = (usubjid = id)) end = eof;
do p = 1 to num;
set have2 (rename = (date=date2)) nobs = num point = p;
if usubjid = id then do;
d = abs(date - date2);
put id = usubjid = date = date2 = d =;
if d < 5 then output;
end;
else continue;
end;
put '===========================';
if eof then stop;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.