BookmarkSubscribeRSS Feed
derekcra
Fluorite | Level 6

Hello all -

 

I'm looking for some help on how to do multiple multidata hash joins in tandem. Essentially, I'm trying to do two proc sql left joins in a hash join instead.

 

Here are the tables I'm dealing with:

 

HAVE_BIG: RECORDS (NAME, RECORD_ID, DATE_OF_SERVICE)

HAVE _SMALL_1: INCOME_LEVEL (NAME, INCOME, START_DATE, END_DATE)

HAVE _SMALL_2: STATUS (NAME, STATUS, START_DATE, END_DATE)

 

The two small tables may or may not have overlapping dates. In proc sql, I would do it like this:

 

proc sql;
create table FINAL_TABLE as select
	a.name,
	a.record,
	a.date_of_service,
	b.income,
	c.status
from HAVE_BIG a
left join HAVE_SMALL_1 b
on 	a.name = b.name and
	a.date_of_service between b.start_date and b.end_date
left join HAVE_SMALL_2 c
on 	a.name = c.name and
	a.date_of_service between c.start_date and c.end_date
;quit;

 

In hash join, I'm doing it like below, but getting duplicate records and an incorrect output. Can anyone help me translate the proc sql statement above into a hash join, or edit the code below?

 

data FINAL_TABLE;
	if _n_ = 1 then do;
	    declare hash add1(dataset:'HAVE_SMALL_1',multidata: 'Y');
	    add1.defineKey('NAME');
	    add1.defineData('START_DATE','END_DATE','INCOME');
	    add1.defineDone();
	    declare hash add2(dataset:'HAVE_SMALL_2',multidata: 'Y');
	    add2.defineKey('NAME');
	    add2.defineData('START_DATE','END_DATE','STATUS');
	    add2.defineDone();
	end;

	set HAVE_BIG;

	format START_DATE date9. END_DATE date9. INCOME 10.5;
	if add1.find() = 0 then do until (add1.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of INCOME);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;

	format START_DATE date9. END_DATE date9. STATUS 10.5;
	if add2.find() = 0 then do until (add2.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of STATUS);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;
run;

 

Thanks!!!

1 REPLY 1
ChrisNZ
Tourmaline | Level 20

Please provide (as a data step or as a sql query) some data we can use for testing.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 1029 views
  • 0 likes
  • 2 in conversation