## Hash Table

Occasional Contributor
Posts: 16

# Hash Table

Hi guys,

I have 2 Tables with a One-To-Many Relationship.

The first Table, lets say Table a, is a big Table which contains the columns Date, customerID and Revenue.

The second Table b contains Date, customerID, wealth and adress. but only of a few (premium) rich customers.

It´s possible, that we have more than one Observations with the same Date and Customer ID in Table 1 (If he buys more than one products in one month),

in Table b, only one Observation for each Date/CustomerID-Combination is possible.

So, the Tables could look like this:

data a;
input date \$ customerId \$ Revenue \$ ;
datalines;
01JAN01 1 10

01JAN01 1 20

01JAN01 2 30
01FEB01 1 400

01FEB01 2 50

01FEB01 3 60

01FEB01 4 30

01MAR01 1 60

01MAR01 2 30

01APR01 1 60

01APR01 2 60

01MAY01 1 800

01MAY01 2 608

01MAY01 2 67

01MAY01 4 65

01MAY01 5 644
;
run;

And Table b could look like this:

data b;
input date \$ customerId \$ wealth \$ Adress \$;
datalines;
01JAN01 1 1000 Stockholm

01JAN01 2 2000 Paris
01FEB01 1 4000 Munich

01MAR01 1 6000 New York

01MAR01 2 3000 Cologne

01MAR01 3 3000 London
;
run;

Now, I want to make an Inner Join to join both Tables when a.Date = b.Date and a.CustomerID = b.CustomerID.

The final table should have the following columns: date, customerID, Revenue and wealth.

In case that the criteria is met for more combinations of table a, the desired table should contain all rows.

So in this Scenario, the final table should look like this:

 Date CustomerID Revenue Wealth Jan 01 1 10 100000 Jan 01 1 20 100000 Jan 01 2 30 300000 Feb 01 1 30 120000 Mrz 01 1 510 130000 Mrz 01 2 10 200000 Mrz 01 3 10 500000

This would not be a Problem with SQL, but my Professor wants that I use a Hash-Table, because for large datasets, this would be more efficient.

Could someone kindly provide me a Hash Solution for this?

Note: This is just a simplified example that I´ve made. Pls don´t care about the sense of the data, let´s say rich customers can Switch their adress and wealth every month or can be banned from the premium list in every month

With the kindest regards,

Alex

Super User
Posts: 9,227

## Re: Hash Table

We are not really here to answer your course questions.  Investigate and learn it and come back with any specific questions:

https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Regina-User-Group/ShaunKaufma...

Occasional Contributor
Posts: 16

## Re: Hash Table

Well,this is not really a course, It´s a part of my Master Thesis.

I´ve tried to solve my Problem but unfortunately I´m not very familiar with Hash Tables,

I´m only able to solve the Problem with SQL.

So, If anyone else could help me I´d really appreciate it.

Super Contributor
Posts: 500

## Re: Hash Table

Have you read the documentation? Using hash tables is well explained in numerous papers, e.g. Introduction to SAS® Hash Objects - SAS Support

Super User
Posts: 6,543

## Re: Hash Table

Just to add a little bit of a framework, here are the things you will need to learn.

How to load table B into a hash table, where the combination of CustomerID and Date are the keys, and wealth and address are the data elements.

In the same DATA step, read in observations from table A.  For each, see if the hash table contains a match.  If so, retrieve the matching data from the hash table and output the observation.  (If no match, do nothing.)

Occasional Contributor
Posts: 16

## Re: Hash Table

Thanks for your help. Ive tried this, but I´m not really sure if this is the right solution for my Problem.

How can I define that I want a one-to-many relationship?

data desired;

if 0 then set work.b;   /* loading in my small table b*/

declare hash VS(dataset:”work.b”);

rc=VS.defineKey(“date”,”customerid”); /*key = date & customer id */

rc=VS.defineData(“date”,”customerid”, “wealth”); /*without address, cause I don’t want it*/

rc=VS.definedone();

do until(eof);

call missing(date, customerid, revenue); /*Is This right? I don’t really get what call missing does to be honest */

rc=VS.find();

if rc=0 then output; /*because I want an inner join */

end;

stop;

run;

Super Contributor
Posts: 500

## Re: Hash Table

[ Edited ]

Did the result match your expectation?

Some minor fixes, but untested:

``````data desired;
if 0 then set work.b(keep=date customerid wealth);   /* loading in my small table b*/

if _n_ = 1 then do;
declare hash VS(dataset:"work.b");
rc=VS.defineKey("date","customerid"); /*key = date & customer id */
rc=VS.defineData("wealth"); /*without address, cause I don’t want it */
rc=VS.definedone();

call missing(wealth);
end;

set work.a;

if vs.find() = 0;
run;``````

Edit: Added keep-option to  if 0 then set work.b

PROC Star
Posts: 1,351

## Re: Hash Table

@mrzlatan91wrote:

So in this Scenario, the final table should look like this:

 Date CustomerID Revenue Wealth Jan 01 1 10 100000 Jan 01 1 20 100000 Jan 01 2 30 300000 Feb 01 1 30 120000 Mrz 01 1 510 130000 Mrz 01 2 10 200000 Mrz 01 3 10 500000

This would not be a Problem with SQL, but my Professor wants that I use a Hash-Table, because for large datasets, this would be more efficient.

Could someone kindly provide me a Hash Solution for this?

Note: This is just a simplified example that I´ve made. Pls don´t care about the sense of the data, let´s say rich customers can Switch their adress and wealth every month or can be banned from the premium list in every month

With the kindest regards,

Alex

Good morning, I fail to understand why your wealth values in your final table are way off than what it is in data b?

Occasional Contributor
Posts: 16

## Re: Hash Table

Dear novinosrin,

sorry, my mistake. I tried to make an other example before, but switched to the current one, but I may have copied the wrong table.

Table b Looks like this:

data b;
input date \$ customerId \$ wealth \$ Adress \$;
datalines;
01JAN01 1 100000 Stockholm

01JAN01 2 300000 Paris
01FEB01 1 120000 Munich

01MAR01 1 130000 New York

01MAR01 2 200000 Cologne

01MAR01 3 500000 London
;
run;

PROC Star
Posts: 1,351

## Re: Hash Table

@mrzlatan91 No worries mate. It appears @andreas_lds has given you the solution. Have you tested it?

Occasional Contributor
Posts: 16

## Re: Hash Table

Unfortunately, I can only test it on friday.

I´ll tell you then if it works.

But thanks mate

Occasional Contributor
Posts: 5