Hi. Until today I had been told both imb_code and spm_calc_batch_date would be match between both my datasets, so the below Merge made sense. However now the requirement has been changed and I'm honestly stumped.
Imb_code will match imb_code, but now the IV dataset's version of spm_calc_batch_date can differ to within 45 days of the BIDS's dataset version spm_calc_batch_date. This is because imb_codes get recycled after 45 days and my dataset contains over 90 days of data.
So the bottom line is I need to merge by imb_code and spm_calc_batch_date (within a 45 day window). Is this even possible?
Any help would be greatly appreciated!
* BIDS Version ;
proc sort data=bids_ora.bi_spm_piece_recon out=BIDS_bi_spm_piece_recon;
by imb_code spm_calc_batch_date;
run;
*IV Version ;
proc sort data=iv_ora.bi_spm_piece_recon out=IV_bi_spm_piece_recon;
by imb_code spm_calc_batch_date;
run;
* RULE: PIECES MISSING IN IV QUERY ;
data QueryData;
merge BIDS_bi_spm_piece_recon (in=a)
IV_bi_spm_piece_recon (in=b);
by imb_code spm_calc_batch_date;
if a and not b;
run;
SQL is really good at this type of merge. You can play with the join condition with calculations.
I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?
Here's a start. Mock up some test data and play around until you get what you need.
SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
Hi mate.
This might make sense: Joining two datasets with the closest observation in terms of time stamp or
Combine datasets based on inexact time.
A One-Step Method for Finding the Closest Match
Possible to merge by dates within a range?
Hope this helps
SQL is really good at this type of merge. You can play with the join condition with calculations.
I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?
Here's a start. Mock up some test data and play around until you get what you need.
SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
I agree with Reeza.
Yes, by using between - and join criteria in SQL, given I understood the matching criteria correctly.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.