Does anyone have a good list of options to change to improve the performance of uploading data to Redshift when the connection is made using SAS/ACCESS to ODBC? This is on a system that does not have a license for SAS/ACCESS to Redshift.
I tried uploading a small dataset and it took 34 minutes!!
644  +data RWATEAM.__0001 ;
645  +  set fi.DEMOGRAPHICS (obs=&obs);
646  +run;
NOTE: SAS variable labels, formats, and lengths are not written to DBMS tables.
NOTE: There were 15117 observations read from the data set FI.DEMOGRAPHICS.
NOTE: The data set RWATEAM.__0001 has 15117 observations and 9 variables.
NOTE: DATA statement used (Total process time):
      real time           34:26.78
      cpu time            3.28 seconds
Hi Tom,
Method 1:
Which ODBC driver are you using? If you're using the DataDirect ODBC driver for Amazon Redshift, then you can bulk load.
Use PROC APPEND with the INSERTBUFF data set option. A larger buffer size can improve I/O performance over a small one when you write data to Redshift through the DataDirect driver.
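A minimal sketch of that approach, assuming the RWATEAM libref is already assigned through the ODBC engine; the INSERTBUFF value of 32767 is illustrative and should be tuned for your driver and row width:

```sas
/* Send rows to Redshift in large batches instead of row-by-row.      */
/* INSERTBUFF= controls how many rows go in each insert; 32767 is     */
/* only an example value -- experiment to find what works best.       */
proc append base=RWATEAM.__0001 (insertbuff=32767)
            data=fi.DEMOGRAPHICS (obs=&obs);
run;
```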
Method 2:
Export the data to a CSV file with PROC EXPORT, save the file to local disk, and then upload it to your Amazon S3 bucket (using the Amazon S3 console). Then run the COPY command through SQL pass-through in SAS (or any other SQL client) to load it.
proc sql;
  connect to odbc as mycon (datasrc='RH_Connect'
                            user=xxxx
                            pwd=xxxxx);
  execute (copy test from 's3://mybucket/test.csv'
           credentials 'aws_access_key_id=xxxx;aws_secret_access_key=xxxxxx'
          ) by mycon;
  disconnect from mycon;
quit;
Thanks.
I wonder if anyone has tried using SAS to push files to S3 buckets?
Are there command line Unix commands you could call using the PIPE engine?
Are there SAS fileref engines that would allow you to write the CSV file directly to the S3 bucket?
Are there Unix drivers that would allow the S3 bucket to be mounted as a folder in the Unix file system?
Hello,
I have used the AWS CLI, which I downloaded from AWS.
Once it's configured on the machine that houses SAS, you can use a statement similar to the one below:
x "aws s3 cp d:/filename s3://my-bucket/filename";
You may need to have X command capability turned on (the XCMD system option).
Not sure if this is relevant in your case, but have you tried the DBCOMMIT= option? Setting DBCOMMIT=0 is supposed to ensure there is only one database commit for all of your table inserts.
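As a sketch, DBCOMMIT= can be set as a data set option on the output table (the libref and table names below are illustrative, and combining it with INSERTBUFF= is an assumption worth testing):

```sas
/* Commit once at the end of the load rather than after every batch;  */
/* DBCOMMIT=0 means a single commit for the whole load.               */
/* Libref and table names are illustrative.                           */
data RWATEAM.__0001 (dbcommit=0 insertbuff=32767);
  set fi.DEMOGRAPHICS;
run;
```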
Try adding some options:
options dbidirectexec bufno=100 bufsize=128k;
libname sql odbc datasrc=proddb schema=sas user=user password="*******"
        insert_sql=yes
        insertbuff=32767
        readbuff=10000
        bulkload=yes
        dbcommit=10000;