BookmarkSubscribeRSS Feed
derekcra
Fluorite | Level 6

Hello all -

 

I'm looking for some help on how to do multiple multidata hash joins in tandem. Essentially, I'm trying to do two proc sql left joins in a hash join instead.

 

Here are the tables I'm dealing with:

 

HAVE_BIG: RECORDS (NAME, RECORD_ID, DATE_OF_SERVICE)

HAVE _SMALL_1: INCOME_LEVEL (NAME, INCOME, START_DATE, END_DATE)

HAVE _SMALL_2: STATUS (NAME, STATUS, START_DATE, END_DATE)

 

The two small tables may or may not have overlapping dates. In proc sql, I would do it like this:

 

proc sql;
create table FINAL_TABLE as select
	a.name,
	a.record,
	a.date_of_service,
	b.income,
	c.status
from HAVE_BIG a
left join HAVE_SMALL_1 b
on 	a.name = b.name and
	a.date_of_service between b.start_date and b.end_date
left join HAVE_SMALL_2 c
on 	a.name = c.name and
	a.date_of_service between c.start_date and c.end_date
;quit;

 

In hash join, I'm doing it like below, but getting duplicate records and an incorrect output. Can anyone help me translate the proc sql statement above into a hash join, or edit the code below?

 

data FINAL_TABLE;
	if _n_ = 1 then do;
	    declare hash add1(dataset:'HAVE_SMALL_1',multidata: 'Y');
	    add1.defineKey('NAME');
	    add1.defineData('START_DATE','END_DATE','INCOME');
	    add1.defineDone();
	    declare hash add2(dataset:'HAVE_SMALL_2',multidata: 'Y');
	    add2.defineKey('NAME');
	    add2.defineData('START_DATE','END_DATE','STATUS');
	    add2.defineDone();
	end;

	set HAVE_BIG;

	format START_DATE date9. END_DATE date9. INCOME 10.5;
	if add1.find() = 0 then do until (add1.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of INCOME);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;

	format START_DATE date9. END_DATE date9. STATUS 10.5;
	if add2.find() = 0 then do until (add2.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of STATUS);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;
run;

 

Thanks!!!

1 REPLY 1
ChrisNZ
Tourmaline | Level 20

Please provide (as a data step or as a sql query) some data we can use for testing.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 1160 views
  • 0 likes
  • 2 in conversation