BookmarkSubscribeRSS Feed
MZunnurain
Calcite | Level 5
Hello,

I'm trying to learn some hash object....The original was written in a merge, I tried proc sql and it is acceptable using left join, just wondering how to get the left join using hash object..


data address;
addr_key=10000001;
address='test address1';
output;
addr_key=10000002;
address='test address2';
output;
addr_key=10000003;
address='test address3';
output;
run;

proc print u;
title 'address';
run;


data provzip;
addr_key=10000001;
postl_cd='87048';
output;
addr_key=10000002;
postl_cd='39735';
output;
run;

proc print u;
title 'provzip';
run;


data hash1 (drop=rc);
length postl_cd $5;
declare AssociativeArray hh () ;
rc = hh.DefineKey ( 'addr_key' ) ;
rc = hh.DefineData ( 'addr_key', 'address' ) ;
rc = hh.DefineDone () ;
call missing(postl_cd);
do until ( eof1 ) ;
set address end = eof1 ;
rc = hh.add () ;
end ;
do until ( eof2 ) ;
set provzip end = eof2 ;
rc = hh.find () ;
if rc ^=0 then postl_cd=' ';
if rc =0 then output;
end ;
run ;

proc print data=hash1;
title 'hash1';
run;

data hash2;
if _n_ = 1 then do;
if 0 then set address;
declare hash lookup(dataset:'address');
lookup.defineKey('addr_key');
lookup.definedata('address');
lookup.defineDone();
end;

set provzip;
if lookup.find(key:addr_key) ne 0 then postl_cd=' ';
run;


PROC PRINT;
title 'hash2';
RUN;

proc sql;
create table sql1 as
select *
from address a
left join provzip b
on a.addr_key=b.addr_key
;
quit;
run;

proc print u;
title 'sql';
run;
4 REPLIES 4
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Have you searched the SAS support http://support.sas.com/ website.

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic/post:

left join replace with hash object site:sas.com
deleted_user
Not applicable
You seem to have the basic idea. To get what you want use the logic of the hash1 (or hash2) but put the provzip data set into the hash table and iterate over the address data set.

The logic you're using only gets rows for which you have a postal code; doing it the other way will get all addresses, with postal code missing if the lookup fails.

Use:

> if rc ^=0 then postl_cd=' ';
> if rc =0 then output;

but take the "if rc = 0" off the second statement.

Note: it's usual with a hash lookup like this to put the small data set into the hash table. This beats merge by not requiring the large data set to be sorted. Proc SQL is a viable alternative; only a test will tell which is better.

Cheers,

Jonathan


> Hello,
>
> I'm trying to learn some hash object....The original
> was written in a merge, I tried proc sql and it is
> acceptable using left join, just wondering how to get
> the left join using hash object..
>
>
> data address;
> addr_key=10000001;
> address='test address1';
> output;
> addr_key=10000002;
> address='test address2';
> output;
> addr_key=10000003;
> address='test address3';
> output;
> n;
>
> proc print u;
> title 'address';
> n;
>
>
> data provzip;
> addr_key=10000001;
> postl_cd='87048';
> output;
> addr_key=10000002;
> postl_cd='39735';
> output;
> run;
>
> proc print u;
> title 'provzip';
> un;
>
>
> data hash1 (drop=rc);
> length postl_cd $5;
> declare AssociativeArray hh () ;
> rc = hh.DefineKey ( 'addr_key' ) ;
> rc = hh.DefineData ( 'addr_key', 'address' ) ;
> rc = hh.DefineDone () ;
> call missing(postl_cd);
> do until ( eof1 ) ;
> set address end = eof1 ;
> rc = hh.add () ;
> end ;
> do until ( eof2 ) ;
> set provzip end = eof2 ;
> rc = hh.find () ;
> if rc ^=0 then postl_cd=' ';
> if rc =0 then output;
> end ;
> run ;
>
> proc print data=hash1;
> title 'hash1';
> n;
>
> data hash2;
> if _n_ = 1 then do;
> if 0 then set address;
> declare hash lookup(dataset:'address');
> lookup.defineKey('addr_key');
> lookup.definedata('address');
> lookup.defineDone();
> end;
>
> set provzip;
> if lookup.find(key:addr_key) ne 0 then postl_cd='
> ';
>
>
>
> PROC PRINT;
> title 'hash2';
> N;
>
> proc sql;
> create table sql1 as
> select *
> from address a
> left join provzip b
> on a.addr_key=b.addr_key
> ;
> quit;
> un;
>
> proc print u;
> title 'sql';
> n;
MZunnurain
Calcite | Level 5
Thank you.

I switched the datasets, and had to tweak a few other things, this one works.



data hash3 (drop=rc);
length postl_cd $5;
declare AssociativeArray hh () ;
rc = hh.DefineKey ( 'addr_key' ) ;
rc = hh.DefineData ( 'addr_key','postl_cd') ;
rc = hh.DefineDone () ;
do until ( eof1 ) ;
set provzip end = eof1 ;
rc = hh.add () ;
end ;
do until ( eof2 ) ;
set address end = eof2 ;
rc = hh.find () ;
if rc ^=0 then postl_cd=' ';
output;
end ;
run ;

proc print data=hash3;
title 'hash3';
run;
DanielSantos
Barite | Level 11
Hi.

Use the dataset option in the object construct, and provide the dataset name you whish to load into the hash.
Then loose all the EOFs and until loops. You just need to read an observation from one dataset and match with the other stored in the hash.

Be aware that until 9.2, the dataset loaded into the hash should have unique keys.

Check the online doc here:
http://support.sas.com/documentation/cdl/en/lrdict/62618/HTML/default/a002576871.htm

Cheers from Portugal.

Daniel Santos @ www.cgd.pt

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 953 views
  • 0 likes
  • 4 in conversation