BookmarkSubscribeRSS Feed
tommy81
Obsidian | Level 7

Hi

I need help to create a hash join equivalent to using two left joins , on total 3 tables like

Proc sql;

create table test as select * from

Table1 as A  left join

Table2 as B on (A.id=B.id)    left join

Table3 as  on (A.key=C.Key)

; quit;

For one table the following would work

assuming Table  T1 >  Table T2

data SET1;

     if 0 then set Table2 ;

declare hash VS(hashexp:7, dataset:Table2 );

VS.definekey('KEY'');

VS.definedata(all:'Y');

VS.definedone();

do until(eof);

set Table1 end=eof;  

   if VS.find()=0 then output;

else do;

call missing(of T1-T2);

output;

end;

end;

stop; run;

5 REPLIES 5
Daryl
SAS Employee

You can have multiple hash objects in one data step.

ChrisNZ
Tourmaline | Level 20

Unsure what the issue is.

This should do what you want and be faster than SQL, provided the tables fit in memory and have unique keys of course.

 

data SET1;
  set TABLE1 ;  
  if _N_=1 then do;
    declare hash VS( dataset:'TABLE2' );
    VS.definekey('KEY');
    VS.definedata(all:'Y');
    VS.definedone();

    declare hash VS1( dataset:'TABLE3' );
    VS1.definekey('KEY');
    VS1.definedata(all:'Y');
    VS1.definedone();
    if 0 then set TABLE2 TABLE3;
  end;
  RC=VS.find();
  RC=VS1.find();
  drop RC;
proc print;run;


To ensure the hash table variables are set to missing before retrieval, use the --  variable list shortcut.

data TABLE1;
  array B {5};
  KEY=1; B1=2;  output;
  KEY=2; B1=.;  output;

data TABLE2;
  array C {5};
  KEY=1; C1=2;  output;
  KEY=2; C1=.;  output;

data TABLE3;
  array D {5};
  KEY=1; D1=2;  output;
  KEY=2; D1=4;  output;


data SET1;
  set TABLE1 ;  
  if _N_=1 then do;
    declare hash VS( dataset:'TABLE2' );
    VS.definekey('KEY');
    VS.definedata(all:'Y');
    VS.definedone();

    declare hash VS1( dataset:'TABLE3' );
    VS1.definekey('KEY');
    VS1.definedata(all:'Y');
    VS1.definedone();
    if 0 then set TABLE2 TABLE3;
  end;
  call missing(of C1--D5);
  RC=VS.find();
  RC=VS1.find();
  drop RC;
proc print;run;

Ksharp
Super User

I am very interesting about your questio.

But since you did not post some sample data,so it is hard to give a exact code. This is an example.

Whether it is what you want.

Maybe in the future, I will use Hash Table to make left join and Product Cartesian . Thank you . Smiley Happy

data class1(keep=name sex age) ;
 set sashelp.class;
run;
data class2(keep=sex weight rename=(sex=_sex));
 set sashelp.class;
 if ranuni(-1) le .8;
run;
data class3(keep=name height rename=(name=_name));
 set sashelp.class;
 if ranuni(-1) ge .8;
run;
data _want(drop=a1 _: rc found);
 set class1;
if _n_ eq 1 then do;
 declare hash ha1(hashexp:10);
 declare hiter _ha1('ha1');
  ha1.definekey('_sex','a1');
  ha1.definedata('weight','_sex');
  ha1.definedone();

do until(last);
 set class2 end=last;
 a1+1;
 ha1.add();
end;

end;



rc=_ha1.first();
do while(not rc);
rc=_ha1.next();
if sex=_sex then do;output; found=1;call missing(a1,_sex,weight);end;
end;
if not found then do;call missing(a1,_sex,weight);output;end;

run; 
data want(drop= a2 _: rc found);
 set _want;
if _n_ eq 1 then do;

 declare hash ha2(hashexp:10);
 declare hiter _ha2('ha2');
  ha2.definekey('_name','a2');
  ha2.definedata('height','_name');
  ha2.definedone();


do until(_last);
 set class3 end=_last;
 a2+1;
 ha2.add();
end;

end;


rc=_ha2.first();
do while(not rc);
rc=_ha2.next();
if name=_name then do;output;found=1;call missing(a2,_name,height);end;
end;
if not found then do;call missing(a2,_name,height);output;end;

run; 

Ksharp

juan1mario
Calcite | Level 5

@Ksharp is there a way you can do this multi left join using sas hash object, but without reading in datasets multiple times inorder to rename variables? My thought is that you have 3 datasets and they're all joined on a composite key of the same 2 variables in each dataset. How could I do this without renaming variables? Thanks 

Ksharp
Super User

This post is very old.

Please start a new session ,let more people to see it.You could get better answer.

 

"How could I do this without renaming variables?"

I think could use this kind of statement.

 

h.find(key:var1,key:var2)

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 5945 views
  • 1 like
  • 5 in conversation