BookmarkSubscribeRSS Feed
milts
Pyrite | Level 9

Hi experts,

Is it true that hashtables performs faster compared with SQL lookups? What are other advantages or limitations? I have no environment to test this out yet.

I'm also curious whether joins performed in-database performs faster versus sas joins. Example given these example:

create table result as select a.key,a.key2,b.var from a left join b on(a.key2 = b.key);

I'm using a's key2 to link with b to lookup the value var.

Kindly advise if explanation is confusing.

Thanks!

Milton

3 REPLIES 3
Ksharp
Super User

Sorry, I can't offer you more information. Maybe you should contact SAS Technical Support .

As far as I know Hash Table is the fastest way to execute querying, so I prefer to Hash Table if I want speed,

I would prefer to SQL if I want to save some words.

Ksharp

LaurentdeWalick
Fluorite | Level 6

Hash is the fastest method for looking up data, since the data is read just once from disk and then loaded into memory. There is also no need to sort or index that table in advance. The most important limitation is that memory available to SAS is enough to fit the hash table. There are situations a PROC SQL might be as fast as a self coded hash lookup, because the SQL optimizer will do a hashjoin if it determines the lookup table fits into memory.

In-database also also usually faster than SAS joins and might even be faster than hash lookup. With in-database joins the SAS system does not need to bring over all data from the database to SAS first, before the join can be excuted. A join done in-database will only transfer the results, resulting in less movement of data. The SQL optimizer tries to pass as much processing as possible to the database, but there are situations it does this less optimal than you can do by hand.

SASKiwi
PROC Star

There is another lookup technique available in SAS that is in many cases comparable in performance to hash and that is the lookup format. With lookup formats the lookup is built in PROC FORMAT before the DATA step you use it in, whereas with hash you have to do it all in the same DATA step. I find being able to separate out the lookup an advantage if you are doing lots of them. I often do more than 10 lookups, sometimes over 20 in the same DATA step. The coding using lookup formats is more compact.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 3142 views
  • 0 likes
  • 4 in conversation