How could the following be done using hash join,there are 13 million records in each table and wanted to check if hash would reduce the processing time.
proc sql;
create table _all as
select a.*
,b.CUST_ID_CD
,b.CUST_TYPE_CD
,b.CUST_ACCT_LINK_DT
,b.PHONE1_TYPE_CD
from Amea_before a left outer join amea_after b on a.acct_no=b.acct_no
group by a.acct_no
order by a.acct_no;
quit;
What is the "group by" for here? I don't see any summary functions? Just want to clear the fog before moving on.
Haikuo
Yes, I forgot to comment that out.Thanks for pointing it out
Please bear with me, here are some my 2cents along with a possible Hash() solution:
In general scenario of large datasets merging, you may have the following options:
http://www.sas.com/technologies/dw/storage/spds/index.html
I hope your pocket is deep enough for it.
data _all;
if _n_=1 then do;
if 0 then set amea_after (keep=acct_no CUST_ID_CD CUST_TYPE_CD CUST_ACCT_LINK_DT PHONE1_TYPE_CD);
declare hash h(dataset:'amea_after (keep=acct_no CUST_ID_CD CUST_TYPE_CD CUST_ACCT_LINK_DT PHONE1_TYPE_CD)', multidata:'y');
h.definekey('acct_no');
h.definedata('CUST_ID_CD', 'CUST_TYPE_CD', 'CUST_ACCT_LINK_DT', 'PHONE1_TYPE_CD');
h.definedone();
end;
set Amea_before;
_rc=h.find();
do _rc=0 by 0 while (_rc=0);
output;
_rc=h.find_next();
end;
drop _rc;
run;
Haikuo
Thanks Haiko.I will try that and let you know
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.