I'm trying to do a data step with multiple hash joins, and the main criteria is that the first hash join has multiple records and will grow my initial dataset. After I join that, I want to continue with that larger dataset and do more 1-to-1 hash joins with additional logic - however, i'm having trouble with this second step.
Here is some example code (ideally, I would see field_4 in my final output):
data data1;
infile datalines delimiter='|';
format
field_1 $10.
field_2 $10.
;
input
field_1: $10.
field_2: $10.
;
datalines;
A|M
B|F
C|F
run;
data data2;
infile datalines delimiter='|';
format
field_1 $10.
field_3 $10.
;
input
field_1: $10.
field_3: $10.
;
datalines;
A|A
B|A
B|B
B|C
C|A
C|D
run;
data data3;
infile datalines delimiter='|';
format
field_3 $10.
field_4 5.
;
input
field_3: $10.
field_4: 5.
;
datalines;
A|1
B|4
C|7
D|9
run;
data hash_join_template;
if _n_ = 1 then do;
declare hash add1(dataset:'data2',multidata:"Y");
add1.defineKey('field_1');
add1.defineData('field_3');
add1.defineDone();
declare hash add2(dataset:'data3');
add2.defineKey('field_3');
add2.defineData('field_4');
add2.defineDone();
end;
set data1;
/* Do multidata join and output duplicate records */
format field_3 $10.;
rc_join1 = add1.find();
output;
do while (rc_join1 = 0);
rc_join1 = add1.find_next();
if rc_join1=0 then output;
end;
drop rc_join1;
/* Attempting to then take my larger-than-original dataset and continue doing logic/joins, but is blank */
format field_4 5.;
rc_join2 = add2.find();
drop rc_join2;
run;
Anyone see where I'm going wrong? Thank you ahead of time!
@derekcra your overall problem is that you want to do an iterative lookup over add1 and for each lookup in add1, do a lookup in add2 (as I interpret your code).
Therefore, you should have the add2.find() Method call inside the loop.
Something like this will do
data hash_join_template;
if _n_ = 1 then do;
declare hash add1(dataset:'data2',multidata:"Y");
add1.defineKey('field_1');
add1.defineData('field_3');
add1.defineDone();
declare hash add2(dataset:'data3');
add2.defineKey('field_3');
add2.defineData('field_4');
add2.defineDone();
end;
set data1;
format field_3 $10.;
format field_4 5.;
do while (add1.do_over() = 0);
rc_join2 = add2.find();
output;
end;
drop rc_join2;
run;
Result:
field_1 field_2 field_3 field_4 A M A 1 B F A 1 B F B 4 B F C 7 C F A 1 C F D 9
@derekcra your overall problem is that you want to do an iterative lookup over add1 and for each lookup in add1, do a lookup in add2 (as I interpret your code).
Therefore, you should have the add2.find() Method call inside the loop.
Something like this will do
data hash_join_template;
if _n_ = 1 then do;
declare hash add1(dataset:'data2',multidata:"Y");
add1.defineKey('field_1');
add1.defineData('field_3');
add1.defineDone();
declare hash add2(dataset:'data3');
add2.defineKey('field_3');
add2.defineData('field_4');
add2.defineDone();
end;
set data1;
format field_3 $10.;
format field_4 5.;
do while (add1.do_over() = 0);
rc_join2 = add2.find();
output;
end;
drop rc_join2;
run;
Result:
field_1 field_2 field_3 field_4 A M A 1 B F A 1 B F B 4 B F C 7 C F A 1 C F D 9
Ah I see - thank you! I will play around with this but it makes sense offhand. Thanks again.
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.