One of the core SAS/ACCESS software principles is to enable efficient data processing and minimize data movement between SAS and external data. If you have your data in a hefty database, efficient data processing would probably benefit from keeping queries execution right in that database, next to data.
SAS/ACCESS development teams put a lot of effort into enabling the conversion of SAS code into native to data-platforms SQL calls so that the delegation of processing happens effectively and implicitly behind the scenes. You might have heard references to this process as an "Implicit SQL pass-through".
Various types of SAS code might be eligible for an "Implicit SQL pass-though” once the corresponding SAS/ACCESS software is put in place. For example, PROC SQL code, certain forms of SAS data step, and a family of frequently used SAS procedures, such as PROC FREQ, RANK, REPORT, SORT, SUMMARY, MEANS, TABULATE. The latter capability is commonly referred to as "SAS In-Database Procedures push-down".
SAS/ACCESS Interfaces provide access from SAS to dozens of external data platforms, but not all SAS/ACCESS Interfaces are created equal. Each Interface has its own development lifecycle, each has stages of being enriched with capabilities, and some might never get certain capabilities: due to the nature of a particular data platform, some features simply do not make sense.
The Maintenance 4 of SAS 9.4, released in November of 2016, brought improvements across SAS data management product line (Release notes for SAS® Data Management offerings in November 2016), including a series of staged enhancements to several SAS/ACCESS Interfaces. Among these staged enhancements, there is an addition of "SAS In-Database Procedures push-down " capability to more data platforms. The enhanced products are:
SAS/ACCESS Interface to Amazon Redshift
SAS/ACCESS Interface to Microsoft SQL Server
SAS/ACCESS Interface to PostgreSQL
SAS/ACCESS Interface to Vertica
This addition expands the list of Interfaces that already had this functionality:
SAS/ACCESS Interface to Aster
SAS/ACCESS Interface to DB2
SAS/ACCESS Interface to Greenplum
SAS/ACCESS Interface to Hadoop
SAS/ACCESS Interface to Hawq
SAS/ACCESS Interface to Impala
SAS/ACCESS Interface to Netezza
SAS/ACCESS Interface to Oracle
SAS/ACCESS Interface to SAP HANA
SAS/ACCESS Interface to Teradata
If you wonder how one would keep track of what SAS/ACCESS Interfaces have what functionality, take a note of this documentation: SAS/ACCESS 9.4 for Relational Databases: Reference, Ninth Edition , Chapter "In-Database Processing with SAS/ACCESS".
Another good reference is the Chapter “Running SAS Procedures inside the Database” in SAS 9.4 In-Database Products: User's Guide, Seventh Edition .
It's time to see an example of a SAS In-Database procedure push-down. We'll use Amazon Redshift as the data platform and SAS/ACCESS Interface to Amazon Redshift as the vehicle. PROC FREQ will be our SAS Procedure.
Do not forget to issue a SASTRACE option to surface details of what exactly was executed in the database.
LIBNAME myredlib REDSHIFT
SERVER="XXXX" PORT=XXXX DATABASE=XXXX
USER=XXXX PASSWORD=XXXX;
OPTIONS SASTRACE=',,,d' SASTRACELOC=saslog NOSTSUFFIX;
PROC FREQ DATA=myredlib.store_dim;
TABLES retail_outlet_format_cd*state_region_cd;
BY retail_outlet_type_cd;
RUN;
Here is the most important snippet from the log:
Note the query that was executed in Amazon Redshift. This is the query generated by SAS/ACCESS from the issued PROC FREQ code. Whenever we see that functions (here these are COUNT() and MIN()), a WHERE-clause, a GROUP BY are a part of that query, it is a great news. We are not moving data for any of these operations into SAS, just the results set travels back.
Now it's worth mentioning one more SAS procedure, PROC TRANSPOSE, that is also capable of running inside data platforms. While providing a powerful capability, it uses a different technology, requires an additional software license and configuration, and is a topic for another article.
... View more