Hi Doc,
quote:"and a real mess with a data step"
Well, not if hash() can be qualified as data step implementation.
Kindly Regards,
Haikuo
I'd do something like this. If there is no chance of multiple matches, stick with new_dataset_ver1 and order it however you want.
If there is a chance that there might be more than one match per ID, look at the code used to obtain new_dataset_ver2. I calculate a variable that rates the closeness of the match. Here I take the distance in days as an absolute value, then select the match with the closest value. I use the monotonic() function which can be tricky if you do more than use it to add row numbers. Here it is an example of how to rank potential matches.
This is untested code, but I do use the technique with time stamped dates. Be careful to set up decision rules (in the order by clause) carefully. The code sample below could force an arbitrary selection between two matches with the same time distance.
proc sql;
create table new_dataset_ver1 as
select
a.ID,
a.examdate as examdate_a,
b.examdate as examdate_b,
b.examdate_p30,
b.examdate_m30,
abs(a.examdate - b.examdate) as difference
from
dataset1 as a,
dataset2 as b
where
a.ID = b.ID and
b.examdate_p30 < a.examdate < b.examdate_m30
order by
ID,
difference;
create table monotonic as
select
*,
monotonic() as line_number
from
new_dataset_ver1;
create table min_linenumber as
select
ID,
min(line_number) as closest_date
from
monotonic | |
group by | |
ID; |
create table new_dataset_ver2 as
select
a.*
from
monotonic as a,
min_linenumber as b
where
a.ID = b.ID and
a.line_number = b.closest_date;
quit;
You can try something like this (untested), it tries to deal properly with missing matches in dataset b :
proc sql;
create table temp as
select a.*, b.examdate as otherDate, b.var1, b.var2,
abs(intck("DAY", a.examdate, b.examdate)) as sepDays
from a left join b on a.id = b.id
where calculated sepDays <= 30
order by a.id, a.examdate, sepDays;
quit;
data want(drop=sepDays);
set temp;
by id examdate;
if first.examdate;
run;
PG
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.