05-14-2018 11:15 AM
I am struggling to join two table without creating duplicate rows using proc sql ( not sure if any other method is more efficient).
Inner join is on: datepart(table1.date)=datepart(table2.date) AND tag=tag AND ID=ID
I think the problem is date and different names in table 1. By just looking that the table its clear that table1's row 1 should be joined with table 2's row 1 because the transaction started at 00:04 in table one and finished at 00:06 in table 2. I issue I am having is I cant join on dates with the timestamp so I am removing timestamps and because of that its creating duplicates.
id tag date amount name_x
1 23 01JUL2018:00:04 12 smith ltd
1 23 01JUL2018:00:09 12 anna smith
id tag ref amount date
1 23 19 12 01JUL2018:00:06:00
1 23 20 12 01JUL2018:00:10:00
id tag date amount name_x ref
1 23 01JUL2018 12 smith ltd 19
1 23 01JUL2018 12 anna smith 20
Appreciate your help.
05-14-2018 11:20 AM
So what is the rule then to join?
You will get duplicates so how do you know which ones to keep and which ones not to keep?
In a simple example like this you could use a MERGE without a BY statement but I suspect that won't generalize to your actual data.
05-14-2018 11:22 AM - edited 05-14-2018 11:34 AM
Timestamp is the rule.. table 1 row1 transaction always occurs before table 2 row1 and should be joined but I am struggling on how to do that.
05-14-2018 11:25 AM
Or another way to phrase it, the datetime in table 2 is the smallest time that is greater than the datetime in table1? Will this hold for other cases? Can you provide several more data points and expected output to ensure the logic is correct?