BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
changxuosu
Quartz | Level 8

Hi SAS communities,

I've heard the power of "in-memory" table look-up presented by hash tables.

Having no backgrounds in programming, I still didn't understand the advantage of hash tables after reading a bunch of sas papers online about hash table.

 

Now I come across a problem in SAS coding and just wondering whether this problem can be solved by hash tables.

 

I'm soliciting general ideas, so no detailed sas code and data sets are available for me to provide. I have no idea how to write hash table code on this problem. I'm just curious whether this problem can be solved by hash tables.

 

Here is the problem:

I have two datasets A and B. Both datasets have the same variables, for example, acct and revenue. The only difference is that dataset A has 1milion observations and dataset B has 1milllion plus 20 observations that are not in dataset A.

I want to find who these 20 additional observations are in dataset B that can not be found in dataset A.

 

I write a proc sql code like the following to approach this problem:

 

proc sql;

select *

from datasetB

where acct_key not in ( select distinct acct_key from datasetA);

quit;

 

The problem is that, with 1million and 1million+20 observations in each dataset, this piece of code takes insane amount of time for SAS to finish the job. My sas even crushes once and runs forever. Wondering how Hash table can help in this case?? How can the magical "in-memory table look up" function possessed by hash table help me to cut down the running time and do the job more efficiently?

 

Thanks in advance to all dear sas community members. I'm indebted to your thoughts/inputs/ideas/suggestions.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Sorry, the key variable should be in quotation marks like this. How about that?

 

data want;
   if 0 then set A;
   if _N_ = 1 then do;
      declare hash h(dataset:'A');
      h.defineKey('acct_key');
      h.defineDone();
   end;

   set B;

   if h.check() ne 0;
run;

View solution in original post

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Sounds like a nice job for a hash object. As you mention, we can't see your data, so needless to say, this code is untested but see if this gives you what you want

 

data want;
   if 0 then set A;
   if _N_ = 1 then do;
      declare hash h(dataset:'A');
      h.defineKey(acct_key);
      h.defineDone();
   end;

   set B;

   if h.check() ne 0;
run;

 

Before explaining this 'magical' in-memory technique, let me know if this works for you 🙂 An absolute requirement for it to work is that the data in the hash object fits in memory.

PeterClemmensen
Tourmaline | Level 20

@changxuosu, did the above solve your problem? 🙂

changxuosu
Quartz | Level 8
@PeterClemmensen I'm sorry it took a while to get back to you. I was kept from unexpected deadlines. Thank you so much for your suggestion. I tried it and still trying to understand it.
I met the following error when tried it, so that you know.

ERROR: Undeclared key symbol for hash object at line 31 column 7.
ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.

PeterClemmensen
Tourmaline | Level 20

Sorry, the key variable should be in quotation marks like this. How about that?

 

data want;
   if 0 then set A;
   if _N_ = 1 then do;
      declare hash h(dataset:'A');
      h.defineKey('acct_key');
      h.defineDone();
   end;

   set B;

   if h.check() ne 0;
run;
changxuosu
Quartz | Level 8
thank you so much for your followup. It worked greatly 🙂
PeterClemmensen
Tourmaline | Level 20

No problem. I'm glad you found your answer. Do you have any questions about the code?

changxuosu
Quartz | Level 8
I don't have any questions for now. Have a nice day, warm hearted SAS expert 🙂

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1761 views
  • 3 likes
  • 2 in conversation