Multiple Multidata hash joins

derekcra — Wed, 17 Feb 2021 19:57:12 GMT

Hello all -

I'm looking for some help on how to do multiple multidata hash joins in tandem. Essentially, I'm trying to do two proc sql left joins in a hash join instead.

Here are the tables I'm dealing with:

HAVE_BIG: RECORDS (NAME, RECORD_ID, DATE_OF_SERVICE)

HAVE _SMALL_1: INCOME_LEVEL (NAME, INCOME, START_DATE, END_DATE)

HAVE _SMALL_2: STATUS (NAME, STATUS, START_DATE, END_DATE)

The two small tables may or may not have overlapping dates. In proc sql, I would do it like this:

proc sql;
create table FINAL_TABLE as select
	a.name,
	a.record,
	a.date_of_service,
	b.income,
	c.status
from HAVE_BIG a
left join HAVE_SMALL_1 b
on 	a.name = b.name and
	a.date_of_service between b.start_date and b.end_date
left join HAVE_SMALL_2 c
on 	a.name = c.name and
	a.date_of_service between c.start_date and c.end_date
;quit;

In hash join, I'm doing it like below, but getting duplicate records and an incorrect output. Can anyone help me translate the proc sql statement above into a hash join, or edit the code below?

data FINAL_TABLE;
	if _n_ = 1 then do;
	    declare hash add1(dataset:'HAVE_SMALL_1',multidata: 'Y');
	    add1.defineKey('NAME');
	    add1.defineData('START_DATE','END_DATE','INCOME');
	    add1.defineDone();
	    declare hash add2(dataset:'HAVE_SMALL_2',multidata: 'Y');
	    add2.defineKey('NAME');
	    add2.defineData('START_DATE','END_DATE','STATUS');
	    add2.defineDone();
	end;

	set HAVE_BIG;

	format START_DATE date9. END_DATE date9. INCOME 10.5;
	if add1.find() = 0 then do until (add1.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of INCOME);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;

	format START_DATE date9. END_DATE date9. STATUS 10.5;
	if add2.find() = 0 then do until (add2.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of STATUS);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;
run;

Thanks!!!

Re: Multiple Multidata hash joins

ChrisNZ — Thu, 18 Feb 2021 02:56:02 GMT

Please provide (as a data step or as a sql query) some data we can use for testing.

topic Multiple Multidata hash joins in SAS Programming

Multiple Multidata hash joins

Re: Multiple Multidata hash joins