We’re smarter together. Learn from this collection of community knowledge and add your expertise.

SAS/ACCESS in SAS 9.4M4 – SAS In-Database Procedures push-down to more data platforms

by SAS Employee Tatyana on ‎01-09-2017 01:34 PM - edited on ‎01-10-2017 03:15 PM by Community Manager (1,159 Views)

One of the core SAS/ACCESS software principles is to enable efficient data processing and minimize data movement between SAS and  external data. If you have your data in a hefty database, efficient data processing would probably benefit from keeping queries execution right in that database, next to data.

 

SAS/ACCESS development teams put a lot of effort into enabling the conversion of SAS code into native to data-platforms  SQL calls so that the delegation of processing happens effectively and implicitly behind the scenes.  You might have heard  references to this process as an "Implicit SQL pass-through". 

 

Various types of SAS code might be eligible for an "Implicit SQL pass-though” once the corresponding SAS/ACCESS software is put in place. For example,  PROC SQL code, certain forms of SAS data step, and a family of frequently used SAS procedures, such as PROC FREQ, RANK, REPORT, SORT, SUMMARY, MEANS, TABULATE.  The latter capability is commonly referred to as "SAS In-Database Procedures push-down".

 

SAS/ACCESS Interfaces provide access from SAS  to dozens of external data platforms, but not all SAS/ACCESS Interfaces are created equal. Each Interface has its own development lifecycle, each has stages of being enriched with capabilities, and some might never get certain capabilities: due to the nature of a particular data platform, some features simply do not make sense. 

 

The Maintenance 4 of SAS 9.4, released in November of 2016, brought improvements across SAS data management product line (Release notes for SAS® Data Management offerings in November 2016), including a series of staged enhancements to several SAS/ACCESS Interfaces. Among these staged enhancements, there is an addition of "SAS In-Database Procedures push-down " capability to more data platforms. The enhanced products are:

 

  • SAS/ACCESS Interface to Amazon Redshift
  • SAS/ACCESS Interface to Microsoft SQL Server
  • SAS/ACCESS Interface to PostgreSQL
  • SAS/ACCESS Interface to Vertica

 

 

This addition expands the list of Interfaces that already had this functionality:

  • SAS/ACCESS Interface to Aster
  • SAS/ACCESS Interface to DB2
  • SAS/ACCESS Interface to Greenplum
  • SAS/ACCESS Interface to Hadoop
  • SAS/ACCESS Interface to Hawq
  • SAS/ACCESS Interface to Impala
  • SAS/ACCESS Interface to Netezza
  • SAS/ACCESS Interface to Oracle
  • SAS/ACCESS Interface to SAP HANA
  • SAS/ACCESS Interface to Teradata

 

If you wonder how one would keep track of what SAS/ACCESS Interfaces have what functionality, take a note of this documentation: SAS/ACCESS 9.4 for Relational Databases: Reference, Ninth Edition , Chapter "In-Database Processing with SAS/ACCESS".

 

Another good reference is the Chapter “Running SAS Procedures inside the Database” in SAS 9.4 In-Database Products: User's Guide, Seventh Edition  .

 

 

It's time to see an example of a SAS In-Database procedure push-down. We'll use Amazon Redshift as the data platform and SAS/ACCESS Interface to Amazon Redshift as the vehicle.  PROC FREQ will be our SAS Procedure.

Do not forget to issue a SASTRACE option to surface details of what exactly was executed in the database.

 

 

LIBNAME myredlib  REDSHIFT  
 SERVER="XXXX"   PORT=XXXX    DATABASE=XXXX
 USER=XXXX  PASSWORD=XXXX;                                             
 

OPTIONS SASTRACE=',,,d' SASTRACELOC=saslog NOSTSUFFIX;


PROC FREQ DATA=myredlib.store_dim;
   TABLES retail_outlet_format_cd*state_region_cd;
   BY retail_outlet_type_cd;
RUN;

 

 

 Here is the most important snippet from the log:

 

PROC_FREQ_Amazon_Redshift.jpg

 

Note the query that was executed in Amazon Redshift. This is the query generated by SAS/ACCESS from the issued PROC FREQ code. Whenever we see that functions (here these are COUNT() and MIN()), a WHERE-clause, a GROUP BY are a part of that query, it is a great news. We are not moving data for any of these operations into SAS, just the results set travels back.

 

Now  it's worth mentioning one more SAS procedure, PROC TRANSPOSE, that is also capable of running inside data platforms. While providing a powerful capability, it uses a different technology, requires an additional software license and configuration, and is a topic for another article.

Contributors
Your turn
Sign In!

Want to write an article? Sign in with your profile.