BookmarkSubscribeRSS Feed
derekcra
Fluorite | Level 6

Hello all -

 

I'm looking for some help on how to do multiple multidata hash joins in tandem. Essentially, I'm trying to do two proc sql left joins in a hash join instead.

 

Here are the tables I'm dealing with:

 

HAVE_BIG: RECORDS (NAME, RECORD_ID, DATE_OF_SERVICE)

HAVE _SMALL_1: INCOME_LEVEL (NAME, INCOME, START_DATE, END_DATE)

HAVE _SMALL_2: STATUS (NAME, STATUS, START_DATE, END_DATE)

 

The two small tables may or may not have overlapping dates. In proc sql, I would do it like this:

 

proc sql;
create table FINAL_TABLE as select
	a.name,
	a.record,
	a.date_of_service,
	b.income,
	c.status
from HAVE_BIG a
left join HAVE_SMALL_1 b
on 	a.name = b.name and
	a.date_of_service between b.start_date and b.end_date
left join HAVE_SMALL_2 c
on 	a.name = c.name and
	a.date_of_service between c.start_date and c.end_date
;quit;

 

In hash join, I'm doing it like below, but getting duplicate records and an incorrect output. Can anyone help me translate the proc sql statement above into a hash join, or edit the code below?

 

data FINAL_TABLE;
	if _n_ = 1 then do;
	    declare hash add1(dataset:'HAVE_SMALL_1',multidata: 'Y');
	    add1.defineKey('NAME');
	    add1.defineData('START_DATE','END_DATE','INCOME');
	    add1.defineDone();
	    declare hash add2(dataset:'HAVE_SMALL_2',multidata: 'Y');
	    add2.defineKey('NAME');
	    add2.defineData('START_DATE','END_DATE','STATUS');
	    add2.defineDone();
	end;

	set HAVE_BIG;

	format START_DATE date9. END_DATE date9. INCOME 10.5;
	if add1.find() = 0 then do until (add1.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of INCOME);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;

	format START_DATE date9. END_DATE date9. STATUS 10.5;
	if add2.find() = 0 then do until (add2.find_next());
	    if START_DATE le DATE_OF_SERVICE le END_DATE then do;
	      found=1;
	      output;
	    end;
	end;
	call missing(of STATUS);  *full list of values to clear - all of hash data elements;
	if not (found) then output;
	drop found START_DATE END_DATE;
run;

 

Thanks!!!

1 REPLY 1
ChrisNZ
Tourmaline | Level 20

Please provide (as a data step or as a sql query) some data we can use for testing.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 573 views
  • 0 likes
  • 2 in conversation