BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Hima
Obsidian | Level 7

I am trying to create a hash table for the below criteria. Please help.

Table_A
ID Name Fee_Amt
1 a 0
2 b 200
3 c 0
4 d 400
5 e 500

Table_B
ID City
1 Test1
2 Test2
3 Test3
4 Test4
5 Test5

Result
ID Name City      Fee_Amt      Fee
1      a      Test1    0                No
2      b      Test2   200             Yes
3      c      Test3    0                No
4      d      Test4   400             Yes
5      e      Test5   500             Yes

Usually, in PROC SQL the below code works good but it takes hours to run if the observations are in millions. I happened to know that Hash tables are effective in this case. If some can help me creating a hash table for left join, it would be a great help. Thank you.

Proc SQL Code:
PROC SQL;
CREATE TABLE RESULT AS
SELECT A.*, B.CITY, CASE WHEN (A.ID = B.ID AND A.FEE_AMT > 0) THEN 'YES' ELSE 'NO' END AS FEE
FROM TABLE_A LEFT JOIN TABLE_B
ON A.ID = B.ID;
QUIT;

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Hi Hima,

I am new to hash as well, so just to start discussion. Assume that your table_b is smaller that you want to put it into hash(), also you want a merge(inner join), not a left join showing in your SQL code.

data Table_A;

input ID Name $ Fee_Amt;

cards;

1 a 0

2 b 200

3 c 0

4 d 400

5 e 500

6 t 100

;

data Table_B;

input ID City $;

cards;

1 Test1

2 Test2

3 Test3

4 Test4

5 Test5

;

data want;

  if 0 then set table_b;

  if _n_=1 then do;

  declare hash hh(dataset: 'table_b');

  hh.definekey ('id');

  hh.definedata('city');

  hh.definedone();

  end;

  set table_a;

  fee=ifc(fee_amt=0,'NO','YES');

  _n_=hh.find();

  if _n_=0 then output;

  run;

  proc print;run;

Regards,

Haikuo

View solution in original post

5 REPLIES 5
Haikuo
Onyx | Level 15

Hi Hima,

I am new to hash as well, so just to start discussion. Assume that your table_b is smaller that you want to put it into hash(), also you want a merge(inner join), not a left join showing in your SQL code.

data Table_A;

input ID Name $ Fee_Amt;

cards;

1 a 0

2 b 200

3 c 0

4 d 400

5 e 500

6 t 100

;

data Table_B;

input ID City $;

cards;

1 Test1

2 Test2

3 Test3

4 Test4

5 Test5

;

data want;

  if 0 then set table_b;

  if _n_=1 then do;

  declare hash hh(dataset: 'table_b');

  hh.definekey ('id');

  hh.definedata('city');

  hh.definedone();

  end;

  set table_a;

  fee=ifc(fee_amt=0,'NO','YES');

  _n_=hh.find();

  if _n_=0 then output;

  run;

  proc print;run;

Regards,

Haikuo

Hima
Obsidian | Level 7

Thank you. This is exactly what I need. Great Expertise!!!!

MikeZdeb
Rhodochrosite | Level 12

hi ... you can also do some reading (my experience with these is that I think that this order works best) ...

Getting Started with the DATA Step Hash Object

http://support.sas.com/rnd/base/datastep/dot/hash-getting-started.pdf

Getting Started with the DATA Step Hash Iterator

http://support.sas.com/rnd/base/datastep/dot/iterator-getting-started.pdf

How Do I Love Hash Tables? Let Me Count The Ways!

http://www2.sas.com/proceedings/forum2008/029-2008.pdf

Better Hashing in SAS 9.2

http://support.sas.com/resources/papers/sgf2008/hashing92.pdf

Hash Crash and Beyond

http://www2.sas.com/proceedings/forum2008/037-2008.pdf

and this is nice to have around ...

Hash Object Tip Sheet

http://support.sas.com/rnd/base/datastep/dot/hash-tip-sheet.pdf

sassharp
Calcite | Level 5

can anybody give example and code for how to incorporate an option 'NODUPKEY' in hash sort as in proc sort with nodupkey?

thanks.

Ksharp
Super User

By default, Hash Table is NODUPKEY, it keeps the first one of duplicated key.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1024 views
  • 3 likes
  • 5 in conversation