Hi. Until today I had been told both imb_code and spm_calc_batch_date would be match between both my datasets, so the below Merge made sense. However now the requirement has been changed and I'm honestly stumped.
Imb_code will match imb_code, but now the IV dataset's version of spm_calc_batch_date can differ to within 45 days of the BIDS's dataset version spm_calc_batch_date. This is because imb_codes get recycled after 45 days and my dataset contains over 90 days of data.
So the bottom line is I need to merge by imb_code and spm_calc_batch_date (within a 45 day window). Is this even possible?
Any help would be greatly appreciated!
* BIDS Version ;
proc sort data=bids_ora.bi_spm_piece_recon out=BIDS_bi_spm_piece_recon;
by imb_code spm_calc_batch_date;
run;
*IV Version ;
proc sort data=iv_ora.bi_spm_piece_recon out=IV_bi_spm_piece_recon;
by imb_code spm_calc_batch_date;
run;
* RULE: PIECES MISSING IN IV QUERY ;
data QueryData;
merge BIDS_bi_spm_piece_recon (in=a)
IV_bi_spm_piece_recon (in=b);
by imb_code spm_calc_batch_date;
if a and not b;
run;
SQL is really good at this type of merge. You can play with the join condition with calculations.
I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?
Here's a start. Mock up some test data and play around until you get what you need.
SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
Hi mate.
This might make sense: Joining two datasets with the closest observation in terms of time stamp or
Combine datasets based on inexact time.
A One-Step Method for Finding the Closest Match
Possible to merge by dates within a range?
Hope this helps
SQL is really good at this type of merge. You can play with the join condition with calculations.
I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?
Here's a start. Mock up some test data and play around until you get what you need.
SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
I agree with Reeza.
Yes, by using between - and join criteria in SQL, given I understood the matching criteria correctly.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.