BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Hima
Obsidian | Level 7

I am trying to create a hash table for the below criteria. Please help.

Table_A
ID Name Fee_Amt
1 a 0
2 b 200
3 c 0
4 d 400
5 e 500

Table_B
ID City
1 Test1
2 Test2
3 Test3
4 Test4
5 Test5

Result
ID Name City      Fee_Amt      Fee
1      a      Test1    0                No
2      b      Test2   200             Yes
3      c      Test3    0                No
4      d      Test4   400             Yes
5      e      Test5   500             Yes

Usually, in PROC SQL the below code works good but it takes hours to run if the observations are in millions. I happened to know that Hash tables are effective in this case. If some can help me creating a hash table for left join, it would be a great help. Thank you.

Proc SQL Code:
PROC SQL;
CREATE TABLE RESULT AS
SELECT A.*, B.CITY, CASE WHEN (A.ID = B.ID AND A.FEE_AMT > 0) THEN 'YES' ELSE 'NO' END AS FEE
FROM TABLE_A LEFT JOIN TABLE_B
ON A.ID = B.ID;
QUIT;

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Hi Hima,

I am new to hash as well, so just to start discussion. Assume that your table_b is smaller that you want to put it into hash(), also you want a merge(inner join), not a left join showing in your SQL code.

data Table_A;

input ID Name $ Fee_Amt;

cards;

1 a 0

2 b 200

3 c 0

4 d 400

5 e 500

6 t 100

;

data Table_B;

input ID City $;

cards;

1 Test1

2 Test2

3 Test3

4 Test4

5 Test5

;

data want;

  if 0 then set table_b;

  if _n_=1 then do;

  declare hash hh(dataset: 'table_b');

  hh.definekey ('id');

  hh.definedata('city');

  hh.definedone();

  end;

  set table_a;

  fee=ifc(fee_amt=0,'NO','YES');

  _n_=hh.find();

  if _n_=0 then output;

  run;

  proc print;run;

Regards,

Haikuo

View solution in original post

5 REPLIES 5
Haikuo
Onyx | Level 15

Hi Hima,

I am new to hash as well, so just to start discussion. Assume that your table_b is smaller that you want to put it into hash(), also you want a merge(inner join), not a left join showing in your SQL code.

data Table_A;

input ID Name $ Fee_Amt;

cards;

1 a 0

2 b 200

3 c 0

4 d 400

5 e 500

6 t 100

;

data Table_B;

input ID City $;

cards;

1 Test1

2 Test2

3 Test3

4 Test4

5 Test5

;

data want;

  if 0 then set table_b;

  if _n_=1 then do;

  declare hash hh(dataset: 'table_b');

  hh.definekey ('id');

  hh.definedata('city');

  hh.definedone();

  end;

  set table_a;

  fee=ifc(fee_amt=0,'NO','YES');

  _n_=hh.find();

  if _n_=0 then output;

  run;

  proc print;run;

Regards,

Haikuo

Hima
Obsidian | Level 7

Thank you. This is exactly what I need. Great Expertise!!!!

MikeZdeb
Rhodochrosite | Level 12

hi ... you can also do some reading (my experience with these is that I think that this order works best) ...

Getting Started with the DATA Step Hash Object

http://support.sas.com/rnd/base/datastep/dot/hash-getting-started.pdf

Getting Started with the DATA Step Hash Iterator

http://support.sas.com/rnd/base/datastep/dot/iterator-getting-started.pdf

How Do I Love Hash Tables? Let Me Count The Ways!

http://www2.sas.com/proceedings/forum2008/029-2008.pdf

Better Hashing in SAS 9.2

http://support.sas.com/resources/papers/sgf2008/hashing92.pdf

Hash Crash and Beyond

http://www2.sas.com/proceedings/forum2008/037-2008.pdf

and this is nice to have around ...

Hash Object Tip Sheet

http://support.sas.com/rnd/base/datastep/dot/hash-tip-sheet.pdf

sassharp
Calcite | Level 5

can anybody give example and code for how to incorporate an option 'NODUPKEY' in hash sort as in proc sort with nodupkey?

thanks.

Ksharp
Super User

By default, Hash Table is NODUPKEY, it keeps the first one of duplicated key.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1031 views
  • 3 likes
  • 5 in conversation