DATA Step, Macro, Functions and more

Big Merge Troubles

Accepted Solution Solved
Reply
Regular Contributor
Posts: 212
Accepted Solution

Big Merge Troubles

[ Edited ]

Hi. Until today I had been told both imb_code and spm_calc_batch_date would be match between both my datasets, so the below Merge made sense.  However now the requirement has been changed and I'm honestly stumped.  

 

Imb_code will match imb_code, but now the IV dataset's version of spm_calc_batch_date  can differ to within 45 days of the BIDS's dataset version spm_calc_batch_date. This is because imb_codes get recycled after 45 days and my dataset contains over 90 days of data.

 

So the bottom line is I need to merge by imb_code and spm_calc_batch_date (within a 45 day window).  Is this even possible?

 

Any help would be greatly appreciated! 

 

* BIDS Version ;
proc sort data=bids_ora.bi_spm_piece_recon out=BIDS_bi_spm_piece_recon; by imb_code spm_calc_batch_date; run;
*IV Version ; proc sort data=iv_ora.bi_spm_piece_recon out=IV_bi_spm_piece_recon; by imb_code spm_calc_batch_date; run; * RULE: PIECES MISSING IN IV QUERY ; data QueryData; merge BIDS_bi_spm_piece_recon (in=a) IV_bi_spm_piece_recon (in=b); by imb_code spm_calc_batch_date; if a and not b; run;

 


Accepted Solutions
Solution
‎05-10-2016 02:03 PM
Super User
Posts: 17,784

Re: Big Merge Troubles

SQL is really good at this type of merge. You can play with the join condition with calculations.

I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?

 

Here's a start. Mock up some test data and play around until you get what you need. 

SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

 

proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;

View solution in original post


All Replies
Regular Contributor
Posts: 212
Solution
‎05-10-2016 02:03 PM
Super User
Posts: 17,784

Re: Big Merge Troubles

SQL is really good at this type of merge. You can play with the join condition with calculations.

I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?

 

Here's a start. Mock up some test data and play around until you get what you need. 

SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

 

proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
Regular Contributor
Posts: 212

Re: Big Merge Troubles

I agree with Reeza.

Super User
Posts: 5,255

Re: Big Merge Troubles

Yes, by using between - and join criteria in SQL, given I understood the matching criteria correctly.

Data never sleeps
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 297 views
  • 5 likes
  • 4 in conversation