BookmarkSubscribeRSS Feed
ysk
Calcite | Level 5 ysk
Calcite | Level 5

Hi SAS community,

 

I have somewhat complex database where I'm trying to link Database A to B where A is my primary database.

 

I'm trying to find the last date from B [B_date] that is associated with date from A. [A_date]

 

For example,

 

Patient A has consecutive visits (Jan 1st, Jan 2nd, Jan 3rd and so on) and I want to merge with Database B and show the last B_date. (i.e. minimum # of difference between A_date and B_date). Since there are multiple visits in B, it's showing different B_date for each of the A_date. I'm getting results like below,

 

Patient A |  A_date1  | Last_B_Date1

Patient A |  A_date2  | Last_B_Date2

Patient A |  A_date3  | Last_B_Date3

Patient A |  A_date4  | Last_B_Date4

 

I want the results to show like 

Patient A |  A_date1  | Last_B_Date1

Patient A |  A_date2  | Last_B_Date1

Patient A |  A_date3  | Last_B_Date1

Patient A |  A_date4  | Last_B_Date1

 

Hope this makes sense.

 

 

 

5 REPLIES 5
Reeza
Super User

How are you doing your merge?

 

Is there an order to the date, or something to identify the first record beyond being first, ie is it the smallest, largest?

 

If via SQL you could consider using a function such as Min() to obtain the value.

If via data step a BY processing group could identify the first record and retain the date.

ysk
Calcite | Level 5 ysk
Calcite | Level 5
forgot to mention that the condition I use is that b_date is earlier than a_date and I just want to extract the last date that meets this criteria.
I'm using left join.
Ksharp
Super User

Are you working for some CRO company and do some SDTM job ?

Post some dummy data and the output you need , that would be better to let others know what you want.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, on your logic alone something like the below should work.  However it is a good idea to post test data - in the form of a datastep - and required output for a more complete answer:

proc sql;
  create table WANT as
  select  MAIN_DATASET.*,
          (select max(DATE) from OTHER_DATASET where MAIN_DATASET.<idvariables> = <idvariables) and DATE <= MAIN_DATASET.DATE) as OTHER_DATE
  from    MAIN_DATASET MAIN_DATASET;
quit;

If you have lots of records, then its probably better to join the select max() on rather than sub-querying like this.

Astounding
PROC Star

Here's a version that is easy when the date is all you are concerned about:

 

data want;

set B (in=inB keep=patient date) A (in=inA) ;

by patient date;

if first.patient then last_B_date=.;

if inB then  last_B_date = date;

retain last_B_date;

if inA;

run;

 

If there are more variables coming in from the B data set, the approach still works but it becomes more cumbersome.  There are more variables to copy over and retain.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1024 views
  • 0 likes
  • 5 in conversation