BookmarkSubscribeRSS Feed
aasdfafafsdfsaf
Calcite | Level 5

Hi everyone,

 

Im working with large datasets on a Oracle server and i was wondering what was the most efficient way of finding observations in one dataset [A] (100m rows,20columns) in another dataset column [B] (3m rows)

 

the code that i have right now is extremely slow and puts a heavy strain on the server, not to mention the file size limitation.

here is the code i have, that i want to make faster

 

proc sql;

create table want_table

as select * from table_a where

table_a_col in (select col_in_table_b);

quit;

 

sorry if it's vague. I'll try to be as clear as possible in the comments.

7 REPLIES 7
Reeza
Super User
Are both these tables on the Oracle Server or is one local?

And is your Oracle server indexed on table_a_col?

That query looks incomplete so showing actual code is likely useful. If you're using tables from different systems, ie Oracle and SAS or different Oracle DB then this will be slow.
aasdfafafsdfsaf
Calcite | Level 5

both are on Oracle and it is not indexed on the col_a.

there are many different columns that i would have to look for using this where col_a in (select col_b from table_b) so i dont know if can reindex on multiple columns.

sorry for the wrong code. still relatively new to SAS

ChrisHemedinger
Community Manager

If these are in fact two different databases, then SAS will be forced to move eligible records from each into the SAS session in order to complete the JOIN.

 

If these are two tables from the same Oracle database, or same server but different schemas, then might be better to use PROC SQL CONNECT to use explicit passthrough -- sending the commands directly to the database for processing -- rather than allowing SAS PROC SQL to try and figure out how to satisfy the query.

 

If that's not possible, then it might be better to bring the smaller table into SAS (or the subset you are likely to need), and upload it to a scratch area on the other database, then join those to get the final result.

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!
aasdfafafsdfsaf
Calcite | Level 5

they will be both in the same oracle server or one local(3m) and one on the oracle server(100M). is there anything specific i can google to get a more precise search result for this issue? thanks!

ChrisHemedinger
Community Manager

If the data is in the same Oracle server, then explicit passthrough will ensure all work is done in the database and will minimize the I/O back to the SAS session.

 

If you have one table local and one table remote, then that's a recipe for a slow query.  SAS is forced to bring ALL records into the SAS session to compare/filter for the join.  If you can upload the smaller table (or a subset) to the database then use PROC SQL to join there, that's best. If the join is on a small subset of key values, you might even be able to create a nested query filter (using the IN operator) with those literal values to subset the larger remote table. You would first build those literal values with a PROC SQL on the local table and SELECT INTO a macro variable, then use that as part of the SQL you send to the remote database.

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!
aasdfafafsdfsaf
Calcite | Level 5
Thank you!
I'll try to implement this first and get back
ChrisNZ
Tourmaline | Level 20

Also  in (select col_in_table_b) can be a lot more expensive than an inner join if the parser does not convert to a join for you, as the whole table B is read for each row in table A.

Submit a query with an inner join.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1938 views
  • 2 likes
  • 4 in conversation