Hello all -
I'm looking for some help on how to do multiple multidata hash joins in tandem. Essentially, I'm trying to do two proc sql left joins in a hash join instead.
Here are the tables I'm dealing with:
HAVE_BIG: RECORDS (NAME, RECORD_ID, DATE_OF_SERVICE)
HAVE _SMALL_1: INCOME_LEVEL (NAME, INCOME, START_DATE, END_DATE)
HAVE _SMALL_2: STATUS (NAME, STATUS, START_DATE, END_DATE)
The two small tables may or may not have overlapping dates. In proc sql, I would do it like this:
proc sql; create table FINAL_TABLE as select a.name, a.record, a.date_of_service, b.income, c.status from HAVE_BIG a left join HAVE_SMALL_1 b on a.name = b.name and a.date_of_service between b.start_date and b.end_date left join HAVE_SMALL_2 c on a.name = c.name and a.date_of_service between c.start_date and c.end_date ;quit;
In hash join, I'm doing it like below, but getting duplicate records and an incorrect output. Can anyone help me translate the proc sql statement above into a hash join, or edit the code below?
data FINAL_TABLE;
if _n_ = 1 then do;
declare hash add1(dataset:'HAVE_SMALL_1',multidata: 'Y');
add1.defineKey('NAME');
add1.defineData('START_DATE','END_DATE','INCOME');
add1.defineDone();
declare hash add2(dataset:'HAVE_SMALL_2',multidata: 'Y');
add2.defineKey('NAME');
add2.defineData('START_DATE','END_DATE','STATUS');
add2.defineDone();
end;
set HAVE_BIG;
format START_DATE date9. END_DATE date9. INCOME 10.5;
if add1.find() = 0 then do until (add1.find_next());
if START_DATE le DATE_OF_SERVICE le END_DATE then do;
found=1;
output;
end;
end;
call missing(of INCOME); *full list of values to clear - all of hash data elements;
if not (found) then output;
drop found START_DATE END_DATE;
format START_DATE date9. END_DATE date9. STATUS 10.5;
if add2.find() = 0 then do until (add2.find_next());
if START_DATE le DATE_OF_SERVICE le END_DATE then do;
found=1;
output;
end;
end;
call missing(of STATUS); *full list of values to clear - all of hash data elements;
if not (found) then output;
drop found START_DATE END_DATE;
run;
Thanks!!!
Please provide (as a data step or as a sql query) some data we can use for testing.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.