I want to peform left outer join in hash with following code (so as to reproduce the result of left join in Proc Sql).
data a;
input key d1 d2 d3;
datalines;
1 1 1 1
1 2 2 2
1 3 3 3
2 1 1 1
2 2 2 2
4 1 1 1
4 1 1 1
5 5 5 5
;
data b;
input key d4 d5 d6 other;
datalines;
1 1 1 1 0
2 2 2 2 0
3 3 3 3 0
4 4 4 4 0
;
data test(drop=rc);
if 0 then set A;
if _n_=1 then do;
dcl hash h(dataset:'A',multidata:'yes');
h.definekey('key');
h.definedata('d1','d2','d3');
h.definedone();
end;
set B (keep=key d4 d5 d6);
rc=h.find();
do while(rc=0);
output;
rc=h.find_next();
end;
run;
Based on that A is your lookup table
data new_test1(drop=rc r);
if 0 then set a;
declare hash hh_pat(dataset:"a", multidata: "y");
rc=hh_pat.defineKey("key");
rc=hh_pat.defineData("d1", "d2", "d3");
rc=hh_pat.defineDone();
do until(eof);
set b end=eof;
call missing(d1, d2, d3);
rc=hh_pat.find();
output;
if (rc = 0) then do;
hh_pat.has_next(result: r);
do while(r ne 0);
hh_pat.find_next();
output;
hh_pat.has_next(result: r);
end;
end;
end;
stop;
run;
That should be identical to
proc sql ;
select b.key, a.d1, a.d2, a.d3 , b.d4, b.d5 , b.d6
from b left join a
on b.key=a.key;
quit;
Based on that A is your lookup table
data new_test1(drop=rc r);
if 0 then set a;
declare hash hh_pat(dataset:"a", multidata: "y");
rc=hh_pat.defineKey("key");
rc=hh_pat.defineData("d1", "d2", "d3");
rc=hh_pat.defineDone();
do until(eof);
set b end=eof;
call missing(d1, d2, d3);
rc=hh_pat.find();
output;
if (rc = 0) then do;
hh_pat.has_next(result: r);
do while(r ne 0);
hh_pat.find_next();
output;
hh_pat.has_next(result: r);
end;
end;
end;
stop;
run;
That should be identical to
proc sql ;
select b.key, a.d1, a.d2, a.d3 , b.d4, b.d5 , b.d6
from b left join a
on b.key=a.key;
quit;
@mohamed_zaki. Thanks. I also want this way too.
proc sql ;
select b.key, a.d1, a.d2, a.d3 , b.d4, b.d5 , b.d6
from a left join b
on a.key=b.key;
quit;
As your lookup table B is not having duplicate, then this is easier
data test(drop=rc);
if 0 then set b;
declare hash hh_pat(dataset:"b");
rc=hh_pat.defineKey("key");
rc=hh_pat.defineData("d4", "d5", "d6","other");
rc=hh_pat.defineDone();
do until(eof);
set a end=eof;
call missing(d4, d5, d6, other);
rc=hh_pat.find();
output;
end;
stop;
run;
or you can modify the previous post code if you will have duplicate.
@ mohamed_zaki, I guess we can't do this while keeping data set "a" in hash table, right?
Yes, as A contains duplicate records based on your key. So you will end up with uncomplete left joined data set. Only if you do specify the option multidata="Y". And by that the hash object keeps the multiple values in a list that is associated with the key. Which can be traversed and manipulated by using several methods such as HAS_NEXT or FIND_NEXT as in the first post.
Your code looks good. Except:
data a;
input key d1 d2 d3;
datalines;
1 1 1 1
1 2 2 2
1 3 3 3
2 1 1 1
2 2 2 2
4 1 1 1
4 1 1 1
5 5 5 5
;
data b;
input key d4 d5 d6 other;
datalines;
1 1 1 1 0
2 2 2 2 0
3 3 3 3 0
4 4 4 4 0
;
data test(drop=rc);
if _n_=1 then do;
if 0 then set A;
dcl hash h(dataset:'A',multidata:'yes');
h.definekey('key');
h.definedata('d1','d2','d3');
h.definedone();
end;
call missing(of _all_);
set B (keep=key d4 d5 d6);
rc=h.find();
if rc ne 0 then output;
do while(rc=0);
output;
rc=h.find_next();
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.