We’re smarter together. Learn from this collection of community knowledge and add your expertise.

Redshift and SAS

by SAS Employee StephenFoerster on ‎05-25-2017 02:57 PM - edited on ‎05-26-2017 11:13 AM by Community Manager (1,532 Views)

The universe is expanding. You can tell by studying the cosmological redshift. The Cloud Computing universe is also expanding. I can tell by counting the requests for information on Amazon Web Services' Redshift.  AWS is arguably the biggest player in the Cloud space. So their database, Redshift, is naturally popular.

 

Amazon Redshift in a Nutshell

Redshift is an MPP database designed to support reporting, analytics, dashboards, and decisioning. Like Teradata, Redshift distributes its data and processing over multiple hosts allowing it to scale for large implementations. For more details on Redshift, check out this FAQ.

 

SAS integration with Redshift

As of SAS 9.4m4, SAS offers the following Redshift integration:

 

  • A Redshift Libname Engine
  • SQL Pass-Through
  • Bulk Loading
  • In-Database Execution of FREQ, MEANS, RANK, REPORT, SORT, and TABULATE

For a great discussion of all of these features, check out this excellent paper by Chris Dehart and GEL expat, Jeff Bailey. The paper is behind a release but actually talks about the more recent features in its futures section. The paper's highlights include load method comparisons as well as an excellent description of SAS to Redshift Bulkload mechanism.

 

Want to see it in action? Check out Chris Dehart's SAS Tech Talk video:

 

 

Two features standout when discussing Redshift -- Bulk-loading and Pass-through/Push-down. Since Redshift will always exist in an AWS Cloud and your SAS server might not, it is critical that SAS can pass data to Redshift as efficiently as possible and that SAS can pass instructions to Redshift so that data is not needlessly passing from Redshift to SAS for processing.

 

Bulkload

As Jeff and Chris' paper details, SAS' bulkload mechanism works by pushing the output data to AWS S3 and then issuing an S3 COPY command. As of the m4 release, this is all done behind the scenes. Users don't have to know anything about the COPY command or S3. They simply have to specify the required bulkload options (e.g. bulkload=yes) as shown in the following example:

 

libname libred redshift server=rsserver db=rsdb user=myuserID pwd=myPwd port=5439;

DATA libred.myclass(
        bulkload=yes   bl_bucket=myBucket   bl_key=99999   
        bl_secret=12345   bl_default_dir='/tmp'   bl_region='us-east-1');
SET sashelp.class;
RUN;

 

Despite the extra I/O, utilizing the S3 Copy Bulkload mechanism gives SAS impressive performance over even the already-impressive performance of native SAS/Access to Redshift. The following table compares the use-case results presented by Jeff and Chris using different load techniques:

 

SAS to Redshift Load Performance
SAS Access to ODBC (Amazon Driver) 28 minutes
SAS Access to Redshift 54 seconds
SAS Bulkload Mechanism 12.5 seconds

 

Pass-Through / Push-Down

Equally as important as load performance, the capability to pass instructions from SAS to Redshift eliminates data transfers all together since data required for SAS queries/processing can stay in Redshift where it is processed. SAS can pass instructions in three different ways -- Implicit SQL Pass-through; Explicit SQL Pass-Through; And, In-Database Execution of SAS procedures.

 

Implicit Pass-Through

As of SAS 9.4m4, SAS can pass 46 SQL functions as well as most joins to Redshift.

 

Explicit Pass-Through

As with most relational databases, SAS can not only pass PROC SQL code implicitly to Redshift, SAS can also send explicit Redshift commands. This allows for non-standard SQL commands and complete control of the requests being passed. In addition to writing your own SQL, SAS DI Studio can generate explicit SQL giving SAS a user interface for Explicit SQL to Redshift.

 

In-Database Procedures

Finally, SAS can push the FREQ, RANK, REPORT, SORT, MEANS, and TABULATE procedures into Redshift for execution as SQL queries. As with all target platforms, be aware of any limitations.

Your turn
Sign In!

Want to write an article? Sign in with your profile.


Looking for the Ask the Expert series? Find it in its new home: communities.sas.com/askexpert.