Help using Base SAS procedures

Merge/Hash/SQL Join

Reply
Occasional Contributor
Posts: 11

Merge/Hash/SQL Join

Hello,

I'm trying to learn some hash object....The original was written in a merge, I tried proc sql and it is acceptable using left join, just wondering how to get the left join using hash object..


data address;
addr_key=10000001;
address='test address1';
output;
addr_key=10000002;
address='test address2';
output;
addr_key=10000003;
address='test address3';
output;
run;

proc print u;
title 'address';
run;


data provzip;
addr_key=10000001;
postl_cd='87048';
output;
addr_key=10000002;
postl_cd='39735';
output;
run;

proc print u;
title 'provzip';
run;


data hash1 (drop=rc);
length postl_cd $5;
declare AssociativeArray hh () ;
rc = hh.DefineKey ( 'addr_key' ) ;
rc = hh.DefineData ( 'addr_key', 'address' ) ;
rc = hh.DefineDone () ;
call missing(postl_cd);
do until ( eof1 ) ;
set address end = eof1 ;
rc = hh.add () ;
end ;
do until ( eof2 ) ;
set provzip end = eof2 ;
rc = hh.find () ;
if rc ^=0 then postl_cd=' ';
if rc =0 then output;
end ;
run ;

proc print data=hash1;
title 'hash1';
run;

data hash2;
if _n_ = 1 then do;
if 0 then set address;
declare hash lookup(dataset:'address');
lookup.defineKey('addr_key');
lookup.definedata('address');
lookup.defineDone();
end;

set provzip;
if lookup.find(key:addr_key) ne 0 then postl_cd=' ';
run;


PROC PRINT;
title 'hash2';
RUN;

proc sql;
create table sql1 as
select *
from address a
left join provzip b
on a.addr_key=b.addr_key
;
quit;
run;

proc print u;
title 'sql';
run;
Super Contributor
Super Contributor
Posts: 3,174

Re: Merge/Hash/SQL Join

Have you searched the SAS support http://support.sas.com/ website.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic/post:

left join replace with hash object site:sas.com
N/A
Posts: 0

Re: Merge/Hash/SQL Join

You seem to have the basic idea. To get what you want use the logic of the hash1 (or hash2) but put the provzip data set into the hash table and iterate over the address data set.

The logic you're using only gets rows for which you have a postal code; doing it the other way will get all addresses, with postal code missing if the lookup fails.

Use:

> if rc ^=0 then postl_cd=' ';
> if rc =0 then output;

but take the "if rc = 0" off the second statement.

Note: it's usual with a hash lookup like this to put the small data set into the hash table. This beats merge by not requiring the large data set to be sorted. Proc SQL is a viable alternative; only a test will tell which is better.

Cheers,

Jonathan


> Hello,
>
> I'm trying to learn some hash object....The original
> was written in a merge, I tried proc sql and it is
> acceptable using left join, just wondering how to get
> the left join using hash object..
>
>
> data address;
> addr_key=10000001;
> address='test address1';
> output;
> addr_key=10000002;
> address='test address2';
> output;
> addr_key=10000003;
> address='test address3';
> output;
> n;
>
> proc print u;
> title 'address';
> n;
>
>
> data provzip;
> addr_key=10000001;
> postl_cd='87048';
> output;
> addr_key=10000002;
> postl_cd='39735';
> output;
> run;
>
> proc print u;
> title 'provzip';
> un;
>
>
> data hash1 (drop=rc);
> length postl_cd $5;
> declare AssociativeArray hh () ;
> rc = hh.DefineKey ( 'addr_key' ) ;
> rc = hh.DefineData ( 'addr_key', 'address' ) ;
> rc = hh.DefineDone () ;
> call missing(postl_cd);
> do until ( eof1 ) ;
> set address end = eof1 ;
> rc = hh.add () ;
> end ;
> do until ( eof2 ) ;
> set provzip end = eof2 ;
> rc = hh.find () ;
> if rc ^=0 then postl_cd=' ';
> if rc =0 then output;
> end ;
> run ;
>
> proc print data=hash1;
> title 'hash1';
> n;
>
> data hash2;
> if _n_ = 1 then do;
> if 0 then set address;
> declare hash lookup(dataset:'address');
> lookup.defineKey('addr_key');
> lookup.definedata('address');
> lookup.defineDone();
> end;
>
> set provzip;
> if lookup.find(key:addr_key) ne 0 then postl_cd='
> ';
>
>
>
> PROC PRINT;
> title 'hash2';
> N;
>
> proc sql;
> create table sql1 as
> select *
> from address a
> left join provzip b
> on a.addr_key=b.addr_key
> ;
> quit;
> un;
>
> proc print u;
> title 'sql';
> n;
Occasional Contributor
Posts: 11

Re: Merge/Hash/SQL Join

Thank you.

I switched the datasets, and had to tweak a few other things, this one works.



data hash3 (drop=rc);
length postl_cd $5;
declare AssociativeArray hh () ;
rc = hh.DefineKey ( 'addr_key' ) ;
rc = hh.DefineData ( 'addr_key','postl_cd') ;
rc = hh.DefineDone () ;
do until ( eof1 ) ;
set provzip end = eof1 ;
rc = hh.add () ;
end ;
do until ( eof2 ) ;
set address end = eof2 ;
rc = hh.find () ;
if rc ^=0 then postl_cd=' ';
output;
end ;
run ;

proc print data=hash3;
title 'hash3';
run;
Super Contributor
Posts: 474

Re: Merge/Hash/SQL Join

Hi.

Use the dataset option in the object construct, and provide the dataset name you whish to load into the hash.
Then loose all the EOFs and until loops. You just need to read an observation from one dataset and match with the other stored in the hash.

Be aware that until 9.2, the dataset loaded into the hash should have unique keys.

Check the online doc here:
http://support.sas.com/documentation/cdl/en/lrdict/62618/HTML/default/a002576871.htm

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Ask a Question
Discussion stats
  • 4 replies
  • 175 views
  • 0 likes
  • 4 in conversation