Hi,
I am new to hash tables, I know this is an efficient way to merge tables. I have found articles how to merge two or more tables in one way only left join, right join or inner in one data step, already. However, I wonder if it is possible to merge three or more tables. Two of them like using left join, third inner join by hash tables in one data step?
Thanks.
Yes, it is possible. But if the datasets have variables in common other than the join keys, you'll have to program logic accordingly. SQL is easier in this regard in that you can just specify A.myvar or B.myvar, etc.
I am working with large datasets so, I guess hash tables should be more efficient.
Please note hash tables need to be loaded into memory so the more tables you load the less available memory you will have for each table.
You need to balance any performance gains you might get against the increased coding complexity and any memory limitations you might have.
If you are doing any data lookups in your joins (looking up just one or two columns) you might want to consider using SAS format lookups instead. They are a lot less complicated to code and are just as fast as hash tables.
I have written this code already. It seems to be fast enough,
klasa3 is left join table and klasa1 is inner join table.
data hash_test;
if 0 then set klasa1 klasa2 klasa3;
if _N_= 1 then do;
declare hash merge(dataset:'klasa1');
merge.definekey('name');
merge.definedata(all:'yes');
merge.definedone();
declare hash merge2(dataset:'klasa3');
merge2.definekey('name');
merge2.definedata('test');
merge2.definedone();
end;
do until(eof);
set klasa2 end=eof;
rc=merge2.find();
if rc =0 then do;
if merge.find()=0 then output;
end;
else do;
if merge.find()=0 then do;
call missing (test);
output;
end;
end;
end;
stop;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.