08-27-2015 11:50 PM
I use SAS/Access to Oracle to connect SAS to Oracle. I need to download Oracle data to SAS table. May I know which is the best method in term of speed:-
I have about 5mil records.
08-28-2015 09:07 AM
I will pick PROC DBLOAD.
But the speed of ETL is not about these PROC , it is about options in LIBNAMES , like you can add readbuff=10000 and bulkload=yes to fast it .
And Special System option: options compress=yes to decrease I/O overhead .
08-28-2015 11:19 AM
Downloading data without transformation doesn't add value.
I would chose SQL so that any transformations is pushed down prior to the download.
09-08-2015 11:12 AM
The SAS procedures (COPY, SQL, APPEND, etc.), and DATA step, will use the same mechanism to bring data from Oracle into SAS. SAS generates a query and ships it off to the database. WHERE clauses and many function calls will pass through to the database without you having to do anything special. There is an exception to this - SQL pass-through (aka. explicit pass-though). In this case you create database specific queries and SAS ships it, as -is, to the database.
You may want to take a look at the READBUFF= option. It may help with your query performance. Recent versions of SAS do a good job of setting this for you.
If your table is partitioned, you may want to look at the DBSLICE= option. To be honest, if your table is large enough to require partitioning, reading it into a SAS table is probably not a great idea.
Here is an explicit pass-thru example. I include it because the CREATE TABLE AS can be difficult to stumble onto.
CONNECT TO ORACLE(PATH=mypath USER=myuser PASSWORD=mypw);
CREATE TABLE mylib.mytable FROM CONNECTION TO ORACLE
(some Oracle specific SQL statement);
The DBLOAD procedure is used to load data from SAS into Oracle. DBLOAD isn't used much for new code since you can run bulkload using dataset and LIBNAME statement options. It's more of a legacy thing.
You can see the SQL being sent to the database using this systems option.
OPTION SASTRACE=',,,d' SASTRACELOC=saslog NOSTSUFFIX;
Ideally you want to ensure that joins, WHERE clauses and functions (mainly when used in a WHERE clause) are passed to the database.
Linus brings up a very good point. The real question is, why do you want to download an entire Oracle table into SAS? Typically, it isn't a great idea. It is best to let the database handle much of the work. There are situations where downloading a subset of data is appropriate. For example: I need to repeatedly process a subset of data and it is quicker, in the long run, to pull the data and then repeatedly process it.
You may find this SGF paper useful. It doesn't address your specific question, but it will help you learn about SQL query performance.
The SQL Tuning Checklist: Making Slow Database Queries a Thing of the Past