BookmarkSubscribeRSS Feed
nkm123
Calcite | Level 5

Does order of table in join condition matter in terms of performance? 

 

ex:- let's say table a has 10,000 records and table b has 1 MM records then which solution works better 

 

solution 1

 

proc sql;

select * from a  b ;

quit;

 

soluton 2 

 

proc sql;

select * from b a;

quit;

 

Thanks,

nkm

3 REPLIES 3
PGStats
Opal | Level 21
With only two tables involved, order makes no difference. SAS/SQL utilises a query optimizer that takes the sizes of the tables into account, among other factors.
PG
nkm123
Calcite | Level 5

But if this is nested loop join (no equi join condition so assume sql will use loop join algorithim) then wouldn't it be nice to have small table in right side and big table to left side.  By below process if have small table in Right will minimize less loop and less disk read. Is my understanding correct?

 

Nested Loop Join – When an equality condition is not specified, a read of the complete contents of the right table is processed for each row in the left table.

 

For these processes, a page of data is first read from the left table and then as much data as can fit in memory is
read from the right table. All merges are made (between any observation in the page from LeftT and all observations
in memory that came from RightT). The results are then output. Then the observations from RightT are flushed from
memory and a new read of RightT pulls in as many observations as can fit in memory. All possible matches are made
(between any observation in the page from LeftT and all observations in memory that came from RightT) and results
are output. This process continues until SQL has looped through all of the table RightT.

 

Then SQL takes a step in the left table and reads a new page of data from LeftT into memory. The process of
looping through right table is repeated for the second page from the left table. Then a third page is read from the left
table and the looping through the right hand table is performed again. The process continues until all the
observations in the left table have been read and matched against every observation in the right hand table.

 

ballardw
Super User

@nkm123 wrote:

But if this is nested loop join (no equi join condition so assume sql will use loop join algorithim) then wouldn't it be nice to have small table in right side and big table to left side.  


 

Since looping and such get involved then it is a good idea to reduce the size of as many tables as possible as early as possible.

You could find your code repeatedly selecting the same records in a loop which can lead to serious performance issues.

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1291 views
  • 1 like
  • 3 in conversation