Hi, I am trying to think of a way to match the closest/nearest datetime in 2 different tables.
First, I will have a master table where it has a datetime column. In my second table, it has a datetime column aswell. However, datetime in Master table and 2nd table NOT NECESSARILY have the same value of datetime that tie to the same ID.
Kindly refer to the code below:
data master; informat master_datetime datetime19.; format master_date datetime19.; input ID master_datetime age ; cards; 1 1/11/11 12:00 11 2 12/15/11 13:00 11 3 12/16/11 3:30 33 4 1/1/12 0:00 55 5 1/16/12 4:50 77 data second; informat secondary_datetime datetime19.; format secondary_datetime datetime19.; input ID secondary_datetime yearsintown ; cards; 1 1/11/11 12:30 11 1 11/1/11 13:00 22 2 12/15/11 13:05 33 2 12/15/11 12:51 55 2 12/15/11 13:03 77
What I want to achieve is that, with the same ID match in 2 tables, only produce the result set that has the closest "secondary_datetime" to "master_datetime".
By looking at the data above, I am only expecting to get the following result:
ID master_datetime Age secondary_datetime yearsintown 1 1/11/11 12:00 11 1/11/11 12:30 11 2 12/15/11 13:00 11 12/15/11 13:03 77
If i were to get the exact datetime match, it has no issue at all. But for now, i need to find the datetime in 2nd table that is closest to the master table.
Can anyone help me with that?
I would merge the data into one dataset, by ID, so that you have all the records in one place, then do date-date to get a difference in datetimes, ensuring to abs() the resulting diff so previous dates are considered. Then its simply a matter of getting the min(diff). I will try to do some code, but I have a meeting shortly, the logic above should get you there though.
Here is an example, please test and post working test data in future:
data master; informat master_datetime datetime19.; format master_datetime datetime19.; id=1 ; master_datetime="01NOV2011 12:00"dt; age=11; run; data second; informat secondary_datetime datetime19.; format secondary_datetime datetime19.; id=1; secondary_datetime="01NOV2011 12:30"dt; age=11; output; id=1; secondary_datetime="11NOV2011 13:00"dt; age=22; output; run; proc sql; create table tot as select a.id, a.master_datetime, a.age, b.secondary_datetime, abs(b.secondary_datetime - a.master_datetime) as diff from master a left join second b on a.id=b.id; create table want as select * from tot group by id having diff=min(diff); quit;
Depending on the whole story you may think of matching by date, neglecting the time,
Otherwise, should the 2nd table datetime always be later than the master table datetime?
If positive you may think of merging data on 1st step and later combine data of two observations.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.