BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Haemoglobin17
Obsidian | Level 7

Hi everyone!

 

I am joining two tables that have a variable id in common and each has a variable date (Date and Date2) that are causally related. In particular, Date can determine zero, one or more events in Date2. Here you can see an image of the two tables:

Table1Capture 1.PNG

 

Table2:

Capture 2.PNG

I would like to join these two tables to show, for every event in Date if it is followed by 0,1 or more events of Date2. Clearly every event in Date2 could be attributed to the oldest date, but I would like that every Date2 is attributed to a Date immediately before until there is another successive Date to attribute the Dates2 exactly as, if someone have an allergic reaction, it is more likely that this event is related to what he ate yesterday then what he ate six months before.

 

Capture 3.PNG

 

I also link some fake data as a working data set.

data Database1;
input id $ date :ddmmyy10. ;
cards;
111 01/01/2015
111 20/03/2015
111 15/06/2016
111 12/03/2017
111 30/06/2020
222 18/03/2020
222 25/04/2020
;
run;

data Database2;
input id $ date2 :ddmmyy10. ;
cards;
111 18/06/2016
111 15/08/2016
111 17/03/2017
111 20/01/2018
111 04/04/2018
111 20/04/2020
222 26/05/2020
;
run;

 

Thank you very much! 😁

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have1;
input id $ date :ddmmyy10.;
format date ddmmyy10.;
cards;
111 01/01/2015
111 20/03/2015
111 15/06/2016
111 12/03/2017
111 30/06/2020
222 18/03/2020
222 25/04/2020
;

data have2;
input id $ date2 :ddmmyy10.;
format date2 ddmmyy10.;
cards;
111 18/06/2016
111 15/08/2016
111 17/03/2017
111 20/01/2018
111 04/04/2018
111 20/04/2020
222 26/05/2020
;

data want(drop = d i);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2', multidata : 'Y', ordered : 'Y');
      h.definekey('id');
      h.definedata('date2');
      h.definedone();
   end;

   set have1;
   by id; 
   set have1(keep = date rename=(date = d) firstobs = 2) 
       have1(drop = _all_ obs = 1);

   if last.id then d = '31dec9999'd;

   do i = 1 by 1 while (h.do_over() = 0);
      if date < date2 < d then output;

      else if i = 1 and (d = '31dec9999'd | date2 > d) then do;
         date2 = .;
         output;
      end;
   end;

   format date2 ddmmyy10.;

run;

 

Result:

 

id   date        date2
111  01/01/2015  .
111  20/03/2015  .
111  15/06/2016  18/06/2016
111  15/06/2016  15/08/2016
111  12/03/2017  17/03/2017
111  12/03/2017  20/01/2018
111  12/03/2017  04/04/2018
111  12/03/2017  20/04/2020
111  30/06/2020  .
222  18/03/2020  .
222  25/04/2020  26/05/2020

 

View solution in original post

3 REPLIES 3
Shmuel
Garnet | Level 18

Use next tested code:

data Database1;
   input id $ date :ddmmyy10. ;
cards;
111 01/01/2015
111 20/03/2015
111 15/06/2016
111 12/03/2017
111 30/06/2020
222 18/03/2020
222 25/04/2020
;
run;

data Database2;
   input id $ date2 :ddmmyy10. ;
cards;
111 18/06/2016
111 15/08/2016
111 17/03/2017
111 20/01/2018
111 04/04/2018
111 20/04/2020
222 26/05/2020
;
run;

data temp / view=temp ;
 set Database1 Database2;
     sortdt = coalesce(date, date2);
     format date date2 sortdt ddmmyy10.;
run;
proc sort data=temp out=temp1; by id sortdt ; run;

data want;
set temp1;
 by id sortdt ;
    retain dt1;
    if first.id then dt1=date; 
    if date then do;
       dt1=date; 
    end;
    else date=dt1;
    drop sortdt dt1;
run;
data want;
 set want;
  by id date;
     if not (first.date and last.date)
        and date2=. then delete;
run;
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have1;
input id $ date :ddmmyy10.;
format date ddmmyy10.;
cards;
111 01/01/2015
111 20/03/2015
111 15/06/2016
111 12/03/2017
111 30/06/2020
222 18/03/2020
222 25/04/2020
;

data have2;
input id $ date2 :ddmmyy10.;
format date2 ddmmyy10.;
cards;
111 18/06/2016
111 15/08/2016
111 17/03/2017
111 20/01/2018
111 04/04/2018
111 20/04/2020
222 26/05/2020
;

data want(drop = d i);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2', multidata : 'Y', ordered : 'Y');
      h.definekey('id');
      h.definedata('date2');
      h.definedone();
   end;

   set have1;
   by id; 
   set have1(keep = date rename=(date = d) firstobs = 2) 
       have1(drop = _all_ obs = 1);

   if last.id then d = '31dec9999'd;

   do i = 1 by 1 while (h.do_over() = 0);
      if date < date2 < d then output;

      else if i = 1 and (d = '31dec9999'd | date2 > d) then do;
         date2 = .;
         output;
      end;
   end;

   format date2 ddmmyy10.;

run;

 

Result:

 

id   date        date2
111  01/01/2015  .
111  20/03/2015  .
111  15/06/2016  18/06/2016
111  15/06/2016  15/08/2016
111  12/03/2017  17/03/2017
111  12/03/2017  20/01/2018
111  12/03/2017  04/04/2018
111  12/03/2017  20/04/2020
111  30/06/2020  .
222  18/03/2020  .
222  25/04/2020  26/05/2020

 

Haemoglobin17
Obsidian | Level 7

Hi @PeterClemmensen  and @Shmuel ,

Thank you both for helping me with your wonderful codes. I tried and studied both (this is the reason I didn't answer immediately) as they give different results on my data. Although none of them gives the exact result I was looking for, the code of @PeterClemmensen  is that who is nearer. This is the reason why I choose your answer. I think I will write you a private message to ask you some explanation about your code 😀.

Anyway, Thank you very much to both of you!

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 665 views
  • 2 likes
  • 3 in conversation