BookmarkSubscribeRSS Feed
anweinbe
Quartz | Level 8

I have one table that shows company announcements. These could be any date at random.

I have another table that shows many different financials about the company from specific dates.

 

Is there a way to add a clause to the join where it will join the first table to the second by joining on the next closest date in the 2nd table?

 

For example, lets assume my announcement date was 5/5/2020.

 

Lets assume that in the next table I had dates of 3/3/2020, 4/27/2020, 5/27/2020, and 6/4/2020.

 

I would like to join these tables on the company name (which is both tables), but I want the join to keep the data from the 5/5/2020 announcement and all the data from the 5/27/2020 record (since its the next date).

 

Thank you!

5 REPLIES 5
Patrick
Opal | Level 21

With SQL only you would first join the two tables over company and then filter the intermediary result to pick the row where the difference between the two dates is minimal.

 

First join

select t1.cust_id, t1.date, t2.date

join table1 t1 left join table2 t2

on t1.cust_id=t2.cust_id   ...may be here also something like: and t2.date between t1.date-20 and t1.date+20 to not select all matching rows from t2

 

Then filter

group by t1.cust_id

having abs(t1.date-t2.date)=min(abs(t1.date-t2.date))

....now you could still end-up with two rows for you customer if there are two rows in t2 where you have dates which are exactly the same amount of days before or after the t1 date. So you will have to extend the filter to pick either the t2.date before or after the t1.date for such a case.

Kurt_Bremser
Super User
proc sql;
create table want (drop=dif) as
  select
    a.company,
    a.somedate,
    b.otherdate,
    abs(a.somedate - b.otherdate) as dif
  from have1 a, have2 b
  where a.company = b.company
  group by a.company, a.somedate
  having dif = min(dif)
;
quit;

 

(posted from my tablet, so untested) 

anweinbe
Quartz | Level 8

Kurt,

This gets me very very close to where I need to be.

The issue I am running into is that in the second file there appears to be duplicate dates, so each of the items in my original table ends up matching to the date in the 2nd table 4-5 times each.

 

Do you suggest that I just take the resulting table from your code and then just remove duplicates on the resulting file?

ballardw
Super User

Is there any rule involved for if two of the "other" dates have the same difference from the base date?

Or if 3 or more records? ( possible if non-unique values of company identification or matching characteristic in the second set for some reason)

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 665 views
  • 1 like
  • 4 in conversation