BookmarkSubscribeRSS Feed
aasdfafafsdfsaf
Calcite | Level 5

Hi everyone,

 

Im working with large datasets on a Oracle server and i was wondering what was the most efficient way of finding observations in one dataset [A] (100m rows,20columns) in another dataset column [B] (3m rows)

 

the code that i have right now is extremely slow and puts a heavy strain on the server, not to mention the file size limitation.

here is the code i have, that i want to make faster

 

proc sql;

create table want_table

as select * from table_a where

table_a_col in (select col_in_table_b);

quit;

 

sorry if it's vague. I'll try to be as clear as possible in the comments.

7 REPLIES 7
Reeza
Super User
Are both these tables on the Oracle Server or is one local?

And is your Oracle server indexed on table_a_col?

That query looks incomplete so showing actual code is likely useful. If you're using tables from different systems, ie Oracle and SAS or different Oracle DB then this will be slow.
aasdfafafsdfsaf
Calcite | Level 5

both are on Oracle and it is not indexed on the col_a.

there are many different columns that i would have to look for using this where col_a in (select col_b from table_b) so i dont know if can reindex on multiple columns.

sorry for the wrong code. still relatively new to SAS

ChrisHemedinger
Community Manager

If these are in fact two different databases, then SAS will be forced to move eligible records from each into the SAS session in order to complete the JOIN.

 

If these are two tables from the same Oracle database, or same server but different schemas, then might be better to use PROC SQL CONNECT to use explicit passthrough -- sending the commands directly to the database for processing -- rather than allowing SAS PROC SQL to try and figure out how to satisfy the query.

 

If that's not possible, then it might be better to bring the smaller table into SAS (or the subset you are likely to need), and upload it to a scratch area on the other database, then join those to get the final result.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
aasdfafafsdfsaf
Calcite | Level 5

they will be both in the same oracle server or one local(3m) and one on the oracle server(100M). is there anything specific i can google to get a more precise search result for this issue? thanks!

ChrisHemedinger
Community Manager

If the data is in the same Oracle server, then explicit passthrough will ensure all work is done in the database and will minimize the I/O back to the SAS session.

 

If you have one table local and one table remote, then that's a recipe for a slow query.  SAS is forced to bring ALL records into the SAS session to compare/filter for the join.  If you can upload the smaller table (or a subset) to the database then use PROC SQL to join there, that's best. If the join is on a small subset of key values, you might even be able to create a nested query filter (using the IN operator) with those literal values to subset the larger remote table. You would first build those literal values with a PROC SQL on the local table and SELECT INTO a macro variable, then use that as part of the SQL you send to the remote database.

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
aasdfafafsdfsaf
Calcite | Level 5
Thank you!
I'll try to implement this first and get back
ChrisNZ
Tourmaline | Level 20

Also  in (select col_in_table_b) can be a lot more expensive than an inner join if the parser does not convert to a join for you, as the whole table B is read for each row in table A.

Submit a query with an inner join.

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1140 views
  • 2 likes
  • 4 in conversation