BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daszlosek
Quartz | Level 8

Hello,

 

I would like to fuzzy match dates that are within 5 days of each other, one data set has records that are much earlier. I was thinking I could subtract each date from have1 by each date from have2 to find the best match. 

 

have1

data have1;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data1_flag;
	FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
;
run;

 


have2


data have2;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data12flag;
	FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
;
run;


want

usubjid date data1_flag data2_flag Date_have2 Date_have1 Date_Diff
Z26-C64 2/2/2017 0 1 2/02/2017 . .
Z26-C64 2/15/2017 1 1 2/15/2017 2/16/2017 1
Z26-C64 3/25/2017 1 1 3/25/2017 3/25/2017 0
Z26-C64 6/2/2017 1 0 . 6/2/2017 .

 

 

EDIT: I believe I need to clarify that I am trying to do this for each unique_id. The case above is for a single unique_id (defined as usubjid).

1 ACCEPTED SOLUTION

Accepted Solutions
KachiM
Rhodochrosite | Level 12

I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.

If this does not work for you, better place some data with 2 or more subjects. Show the required output too.

data have1;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data1_flag;
   FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;
 


data have2;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data12flag;
   FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;

proc sort data = have1;
by usubjid;
run;

proc sort data = have2;
by usubjid;
run;

data need;
   set have1 (rename = (usubjid = id)) end = eof;
      do p = 1 to num;
         set have2 (rename = (date=date2)) nobs = num point = p;
         if usubjid = id then do;
            d = abs(date - date2);
            put id = usubjid = date = date2 = d =;

         if d < 5 then output;
         end;
         else continue;
      end;
      put '===========================';
   if eof then stop;
run;

View solution in original post

5 REPLIES 5
Doc_Duke
Rhodochrosite | Level 12

There are a bunch of SAS macros that have been published to do this.  I did a google search

matching macro sas

and got a bundh.  This one looks like it will do the trick

http://support.sas.com/resources/papers/proceedings10/061-2010.pdf

 

Doc Muhlbaier

Duke

daszlosek
Quartz | Level 8

Hello Doc_Duke,

 

This macro is for matching cases to control and does this by individual usubjid's. I would need to match the same usubjid in each dataset and then fuzzy match the dates (each row) to the closest matching date within 5 days.

 

I will google search for matching macros, thank you for your help!

 

 

KachiM
Rhodochrosite | Level 12

Each record of HAVE1 is compared with each record of HAVE2. Are you interested any difference of 5 days, then uncomment

the statement. Presently, the LOG gives the individual difference.

Hope this is acceptable.

data need;
   set have1 end = eof;
      do p = 1 to num;
         set have2 (rename = (date=date2)) nobs = num point = p;

         d = abs(date - date2);
         put date = date2 = d =;

/*          if d < 5 then output; */
      end;
   if eof then stop;
run;

The LOG:

date=02/16/2017 date2=02/02/2017 d=14
 date=02/16/2017 date2=02/15/2017 d=1
 date=02/16/2017 date2=03/25/2017 d=37
 date=03/25/2017 date2=02/02/2017 d=51
 date=03/25/2017 date2=02/15/2017 d=38
 date=03/25/2017 date2=03/25/2017 d=0
 date=06/02/2017 date2=02/02/2017 d=120
 date=06/02/2017 date2=02/15/2017 d=107
 date=06/02/2017 date2=03/25/2017 d=69
daszlosek
Quartz | Level 8
Hello Datasp,

This is close to what I am looking for, I am trying to tweak it so it does by for matching unique ids between the two datasets (unique_Id = usubjid). If you have any hints that would be helpful!

Best
KachiM
Rhodochrosite | Level 12

I have added another subject and worked with BY Group. Both the datasets are to be sorted for this code to work.

If this does not work for you, better place some data with 2 or more subjects. Show the required output too.

data have1;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data1_flag;
   FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/16/2017 #1
Z26-C64 #3/25/2017 #1
Z26-C64 #6/02/2017 #1
Z28-C65 #6/06/2017 #2
Z28-C65 #7/25/2017 #2
;
run;
 


data have2;
    INFILE DATALINES DLM='#';
    input usubjid $ date mmddyy10. data12flag;
   FORMAT date mmddyy10.;
    datalines;
Z26-C64 #2/02/2017 #1
Z26-C64 #2/15/2017 #1
Z26-C64 #3/25/2017 #1
Z28-C65 #6/02/2017 #2
Z28-C65 #7/02/2017 #2
Z28-C65 #7/22/2017 #2
;
run;

proc sort data = have1;
by usubjid;
run;

proc sort data = have2;
by usubjid;
run;

data need;
   set have1 (rename = (usubjid = id)) end = eof;
      do p = 1 to num;
         set have2 (rename = (date=date2)) nobs = num point = p;
         if usubjid = id then do;
            d = abs(date - date2);
            put id = usubjid = date = date2 = d =;

         if d < 5 then output;
         end;
         else continue;
      end;
      put '===========================';
   if eof then stop;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2861 views
  • 2 likes
  • 3 in conversation