One of the core SAS/ACCESS software principles is to enable efficient data processing and minimize data movement between SAS and external data. If you have your data in a hefty database, efficient data processing would probably benefit from keeping queries execution right in that database, next to data.
SAS/ACCESS development teams put a lot of effort into enabling the conversion of SAS code into native to data-platforms SQL calls so that the delegation of processing happens effectively and implicitly behind the scenes. You might have heard references to this process as an "Implicit SQL pass-through".
Various types of SAS code might be eligible for an "Implicit SQL pass-though” once the corresponding SAS/ACCESS software is put in place. For example, PROC SQL code, certain forms of SAS data step, and a family of frequently used SAS procedures, such as PROC FREQ, RANK, REPORT, SORT, SUMMARY, MEANS, TABULATE. The latter capability is commonly referred to as "SAS In-Database Procedures push-down".
SAS/ACCESS Interfaces provide access from SAS to dozens of external data platforms, but not all SAS/ACCESS Interfaces are created equal. Each Interface has its own development lifecycle, each has stages of being enriched with capabilities, and some might never get certain capabilities: due to the nature of a particular data platform, some features simply do not make sense.
The Maintenance 4 of SAS 9.4, released in November of 2016, brought improvements across SAS data management product line (Release notes for SAS® Data Management offerings in November 2016), including a series of staged enhancements to several SAS/ACCESS Interfaces. Among these staged enhancements, there is an addition of "SAS In-Database Procedures push-down " capability to more data platforms. The enhanced products are:
This addition expands the list of Interfaces that already had this functionality:
If you wonder how one would keep track of what SAS/ACCESS Interfaces have what functionality, take a note of this documentation: SAS/ACCESS 9.4 for Relational Databases: Reference, Ninth Edition , Chapter "In-Database Processing with SAS/ACCESS".
Another good reference is the Chapter “Running SAS Procedures inside the Database” in SAS 9.4 In-Database Products: User's Guide, Seventh Edition .
It's time to see an example of a SAS In-Database procedure push-down. We'll use Amazon Redshift as the data platform and SAS/ACCESS Interface to Amazon Redshift as the vehicle. PROC FREQ will be our SAS Procedure.
Do not forget to issue a SASTRACE option to surface details of what exactly was executed in the database.
LIBNAME myredlib REDSHIFT SERVER="XXXX" PORT=XXXX DATABASE=XXXX USER=XXXX PASSWORD=XXXX; OPTIONS SASTRACE=',,,d' SASTRACELOC=saslog NOSTSUFFIX; PROC FREQ DATA=myredlib.store_dim; TABLES retail_outlet_format_cd*state_region_cd; BY retail_outlet_type_cd; RUN;
Here is the most important snippet from the log:
Note the query that was executed in Amazon Redshift. This is the query generated by SAS/ACCESS from the issued PROC FREQ code. Whenever we see that functions (here these are COUNT() and MIN()), a WHERE-clause, a GROUP BY are a part of that query, it is a great news. We are not moving data for any of these operations into SAS, just the results set travels back.
Now it's worth mentioning one more SAS procedure, PROC TRANSPOSE, that is also capable of running inside data platforms. While providing a powerful capability, it uses a different technology, requires an additional software license and configuration, and is a topic for another article.