04-14-2010 09:33 AM
04-14-2010 02:11 PM
04-14-2010 04:18 PM
07-21-2012 07:09 PM
quote:"and a real mess with a data step"
Well, not if hash() can be qualified as data step implementation.
07-20-2012 11:22 AM
I'd do something like this. If there is no chance of multiple matches, stick with new_dataset_ver1 and order it however you want.
If there is a chance that there might be more than one match per ID, look at the code used to obtain new_dataset_ver2. I calculate a variable that rates the closeness of the match. Here I take the distance in days as an absolute value, then select the match with the closest value. I use the monotonic() function which can be tricky if you do more than use it to add row numbers. Here it is an example of how to rank potential matches.
This is untested code, but I do use the technique with time stamped dates. Be careful to set up decision rules (in the order by clause) carefully. The code sample below could force an arbitrary selection between two matches with the same time distance.
create table new_dataset_ver1 as
a.examdate as examdate_a,
b.examdate as examdate_b,
abs(a.examdate - b.examdate) as difference
dataset1 as a,
dataset2 as b
a.ID = b.ID and
b.examdate_p30 < a.examdate < b.examdate_m30
create table monotonic as
monotonic() as line_number
create table min_linenumber as
min(line_number) as closest_date
create table new_dataset_ver2 as
monotonic as a,
min_linenumber as b
a.ID = b.ID and
a.line_number = b.closest_date;
07-21-2012 09:50 PM
You can try something like this (untested), it tries to deal properly with missing matches in dataset b :
create table temp as
select a.*, b.examdate as otherDate, b.var1, b.var2,
abs(intck("DAY", a.examdate, b.examdate)) as sepDays
from a left join b on a.id = b.id
where calculated sepDays <= 30
order by a.id, a.examdate, sepDays;
by id examdate;