SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
milts
Pyrite | Level 9

Hi experts,

Is it true that hashtables performs faster compared with SQL lookups? What are other advantages or limitations? I have no environment to test this out yet.

I'm also curious whether joins performed in-database performs faster versus sas joins. Example given these example:

create table result as select a.key,a.key2,b.var from a left join b on(a.key2 = b.key);

I'm using a's key2 to link with b to lookup the value var.

Kindly advise if explanation is confusing.

Thanks!

Milton

3 REPLIES 3
Ksharp
Super User

Sorry, I can't offer you more information. Maybe you should contact SAS Technical Support .

As far as I know Hash Table is the fastest way to execute querying, so I prefer to Hash Table if I want speed,

I would prefer to SQL if I want to save some words.

Ksharp

LaurentdeWalick
Fluorite | Level 6

Hash is the fastest method for looking up data, since the data is read just once from disk and then loaded into memory. There is also no need to sort or index that table in advance. The most important limitation is that memory available to SAS is enough to fit the hash table. There are situations a PROC SQL might be as fast as a self coded hash lookup, because the SQL optimizer will do a hashjoin if it determines the lookup table fits into memory.

In-database also also usually faster than SAS joins and might even be faster than hash lookup. With in-database joins the SAS system does not need to bring over all data from the database to SAS first, before the join can be excuted. A join done in-database will only transfer the results, resulting in less movement of data. The SQL optimizer tries to pass as much processing as possible to the database, but there are situations it does this less optimal than you can do by hand.

SASKiwi
PROC Star

There is another lookup technique available in SAS that is in many cases comparable in performance to hash and that is the lookup format. With lookup formats the lookup is built in PROC FORMAT before the DATA step you use it in, whereas with hash you have to do it all in the same DATA step. I find being able to separate out the lookup an advantage if you are doing lots of them. I often do more than 10 lookups, sometimes over 20 in the same DATA step. The coding using lookup formats is more compact.

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 4082 views
  • 0 likes
  • 4 in conversation