03-02-2018 11:41 AM
Does anyone have a good list of options to change to improve performance of uploading data to Redshift when connection is made using SAS/Access to ODBC. That is on a system that does not have a license for SAS/Access to Redshift.
I tried uploading a small dataset and it took 34 minutes!!
644 +data RWATEAM.__0001 ; 645 + set fi.DEMOGRAPHICS (obs=&obs); 646 +run; NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables. NOTE: There were 15117 observations read from the data set FI.DEMOGRAPHICS. NOTE: The data set RWATEAM.__0001 has 15117 observations and 9 variables. NOTE: DATA statement used (Total process time): real time 34:26.78 cpu time 3.28 seconds
03-02-2018 01:46 PM
Which ODBC driver are you using? If your using ODBC Driver from DataDirect for amazon Redshift then you can Bulk load.
Use PROC APPEND with INSERTBUFFER dataset option. A larger buffer size can increase I/O performance over a small size when you write data to Redshift from DataDirect Driver.
Export the data into CSV file using PROC EXPORT and save the file in Local Disk and next upload the file into your buckets in Amazon S3 (Using Amazon S3 Console). Now run a pass-through in SAS (or any other SQL client) for copy command to load.
proc sql; connect to odbc as mycon (datasrc='RH_Connect' user=xxxx pwd=xxxxx); execute (copy test from 's3://mybucket/test.csv' credentials 'aws_access_key_id=xxxx; aws_secret_access_key=xxxxxx';) by mycon; disconnect from mycon; quit;
03-02-2018 03:14 PM
I wonder if anyone has tried using SAS to push files to S3 buckets?
Are there command line Unix commands you could call using the PIPE engine?
Are there SAS fileref engines that would allow you to write the CSV file directly to the S3 bucket?
Are there Unix drivers to allow the S3 bucket to be mounted as folder in the Unix file system?
03-02-2018 07:38 PM
Not sure if this is relevant in your case but have you tried the DBCOMMIT option? Setting DBCOMMIT = 0 is supposed to ensure there is only 1 database commit done for your all of your table inserts.
03-03-2018 05:39 AM
Try add some options.
options dbidirectexec bufno=100 bufsize=128k ; LIBNAME SQL ODBC DATAsrc=proddb SCHEMA=sas USER=user PASSWORD="*******" INSERT_SQL=YES INSERTBUFF=32767 readbuff=10000 bulkload=yes dbcommit=10000 ;