BookmarkSubscribeRSS Feed
Calcite | Level 5 ysk
Calcite | Level 5

Hi SAS community,


I have somewhat complex database where I'm trying to link Database A to B where A is my primary database.


I'm trying to find the last date from B [B_date] that is associated with date from A. [A_date]


For example,


Patient A has consecutive visits (Jan 1st, Jan 2nd, Jan 3rd and so on) and I want to merge with Database B and show the last B_date. (i.e. minimum # of difference between A_date and B_date). Since there are multiple visits in B, it's showing different B_date for each of the A_date. I'm getting results like below,


Patient A |  A_date1  | Last_B_Date1

Patient A |  A_date2  | Last_B_Date2

Patient A |  A_date3  | Last_B_Date3

Patient A |  A_date4  | Last_B_Date4


I want the results to show like 

Patient A |  A_date1  | Last_B_Date1

Patient A |  A_date2  | Last_B_Date1

Patient A |  A_date3  | Last_B_Date1

Patient A |  A_date4  | Last_B_Date1


Hope this makes sense.




Super User

How are you doing your merge?


Is there an order to the date, or something to identify the first record beyond being first, ie is it the smallest, largest?


If via SQL you could consider using a function such as Min() to obtain the value.

If via data step a BY processing group could identify the first record and retain the date.

Calcite | Level 5 ysk
Calcite | Level 5
forgot to mention that the condition I use is that b_date is earlier than a_date and I just want to extract the last date that meets this criteria.
I'm using left join.
Super User

Are you working for some CRO company and do some SDTM job ?

Post some dummy data and the output you need , that would be better to let others know what you want.

Diamond | Level 26 RW9
Diamond | Level 26

Well, on your logic alone something like the below should work.  However it is a good idea to post test data - in the form of a datastep - and required output for a more complete answer:

proc sql;
  create table WANT as
  select  MAIN_DATASET.*,
          (select max(DATE) from OTHER_DATASET where MAIN_DATASET.<idvariables> = <idvariables) and DATE <= MAIN_DATASET.DATE) as OTHER_DATE

If you have lots of records, then its probably better to join the select max() on rather than sub-querying like this.

Opal | Level 21

Here's a version that is easy when the date is all you are concerned about:


data want;

set B (in=inB keep=patient date) A (in=inA) ;

by patient date;

if first.patient then last_B_date=.;

if inB then  last_B_date = date;

retain last_B_date;

if inA;



If there are more variables coming in from the B data set, the approach still works but it becomes more cumbersome.  There are more variables to copy over and retain.



Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg



Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 5 in conversation