DATA Step, Macro, Functions and more

Big Merge Troubles

Accepted Solution Solved
Reply
Regular Contributor
Posts: 212
Accepted Solution

Big Merge Troubles

[ Edited ]

Hi. Until today I had been told both imb_code and spm_calc_batch_date would be match between both my datasets, so the below Merge made sense.  However now the requirement has been changed and I'm honestly stumped.  

 

Imb_code will match imb_code, but now the IV dataset's version of spm_calc_batch_date  can differ to within 45 days of the BIDS's dataset version spm_calc_batch_date. This is because imb_codes get recycled after 45 days and my dataset contains over 90 days of data.

 

So the bottom line is I need to merge by imb_code and spm_calc_batch_date (within a 45 day window).  Is this even possible?

 

Any help would be greatly appreciated! 

 

* BIDS Version ;
proc sort data=bids_ora.bi_spm_piece_recon out=BIDS_bi_spm_piece_recon; by imb_code spm_calc_batch_date; run;
*IV Version ; proc sort data=iv_ora.bi_spm_piece_recon out=IV_bi_spm_piece_recon; by imb_code spm_calc_batch_date; run; * RULE: PIECES MISSING IN IV QUERY ; data QueryData; merge BIDS_bi_spm_piece_recon (in=a) IV_bi_spm_piece_recon (in=b); by imb_code spm_calc_batch_date; if a and not b; run;

 


Accepted Solutions
Solution
‎05-10-2016 02:03 PM
Super User
Posts: 19,820

Re: Big Merge Troubles

Posted in reply to buechler66

SQL is really good at this type of merge. You can play with the join condition with calculations.

I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?

 

Here's a start. Mock up some test data and play around until you get what you need. 

SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

 

proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;

View solution in original post


All Replies
Regular Contributor
Posts: 212
Solution
‎05-10-2016 02:03 PM
Super User
Posts: 19,820

Re: Big Merge Troubles

Posted in reply to buechler66

SQL is really good at this type of merge. You can play with the join condition with calculations.

I'm not sure how the a not b would end up resolving in a situation like this. If the record is in A, not B there wouldn't be dates to join on to check the date interval?

 

Here's a start. Mock up some test data and play around until you get what you need. 

SQL Join types are documented here: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

 

proc sql;
create table QueryData as
select *
from BIDS_bi_spm_piece_recon as a
left join IV_bi_spm_piece_recon as b
on a.imb_code=b.imb_code
and a.spm_calc_batch_date - b.spm_calc_batch_date between 0 and 45;
quit;
Regular Contributor
Posts: 212

Re: Big Merge Troubles

I agree with Reeza.

Super User
Posts: 5,431

Re: Big Merge Troubles

Posted in reply to buechler66

Yes, by using between - and join criteria in SQL, given I understood the matching criteria correctly.

Data never sleeps
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 325 views
  • 5 likes
  • 4 in conversation