BookmarkSubscribeRSS Feed

Accessing Database Data in SAS® Viya®

Started ‎07-06-2023 by
Modified ‎07-06-2023 by
Views 3,590

If you missed my Ask the Expert session “Accessing Database Data in SAS® Viya®”, no worries! You can catch it on-demand any time you like. Here’s the link – registration is required, but it’s free!

Watch the webinar

In this webinar, I discuss how modern analytics requires accessing data from a wide variety of sources, and how SAS has always made that easy with its powerful SAS/ACCESS® interfaces. As data growth accelerates, performance becomes more critical. I demonstrate how you can use your SAS®9 programming skills in the SAS Viya compute server, then show how to use the power of CAS to make processing database data faster than ever.


You will learn:

  • The differences between the Viya compute server and CAS, and when to use each.
  • Tips for accelerating database data processing in SAS®9 and on the Viya compute server.
  • The practical differences between traditional SAS libraries and caslibs.
  • How to work with database-backed caslibs in CAS.

 The questions from the Q&A segment held at the end of the webinar are listed below and you can download a ZIP file containing a PDF with slides & notes and sample code from this link.

 

Q&A

How do we keep our SAS/ACCESS libname connection alive longer without getting timed out?

A LIBNAME connection timeout can be caused by many things – unstable network connections, a database-specific limit on the length of time a connection can be idle, etc. If there is a firewall between the SAS Compute server and the database, there may be a firewall timeout issue as well. This is really an architecture / system issue, and you’ll need to consult your SAS Administrator and perhaps your SYSADMIN or DBA to find and correct the cause of the dropped connections.

 

How have you found it best to reduce the overhead of data getting larger when loading to CAS?

There are a lot of moving parts in this one. First, the encoding of the original data makes a difference. CAS data is all UTF 8 encoded, so there is a natural tendency for data to get bigger. For example, there is a significant bytes-per-character difference between UTF-8 and Latin-1 encoding. There are also size differences attributable to the way CAS data is mapped to disk. You can find a good discussion of this (and more) in Kevin Russell’s SAS Global Forum paper titled “Next Steps: Important Considerations for Moving Your Data and Formats into CAS”.

 

Many of my clients have Snowflake (once Teradata) now - and we are often using pass-through to SAS in our work. Do you have any guidance for pulling Snowflake data/summarizing Snowflake data to Viya using SAS?

I have just started playing with Snowflake data myself. The traditional techniques SAS programmers use with databases are all valid for Snowflake, too. Snowflake has the advantage of that “semantic layer” which can dramatically reduce the need for data duplication. You can use it to provide users access to the results of table joins, etc. in a much more a less burdensome way than most other databases. But in the end, the goal is still the same: maximize in-database processing for sorting, subsetting, and summarization. The techniques discussed in this seminar (and covered in more depth in the “Efficiency Tips for Database Programming in SAS®” training course) definitely apply. By the way, SAS and Snowflake just jointly announced a new Snowflake feature, Snowpark Container Services, that allows you to run SAS® Viya® AI and decisioning capabilities securely right in the Snowflake Data Cloud! Depending on your intended end-use for the data in Viya, this might be something you would want to investigate.

 

Will using explicit pass-through to the database help to overcome connection timeout problems I’ve experienced when using LIBNAME access to the database?

Yes, it should. A LIBNAME statement connection to the database remains open until the libref is cleared. Often, you are not actively using the connection for much of the SAS session, and the database may have timeout limitations on idle connections implemented. For explicit pass through with PROC SQL, we must use the CONNECT statement to establish a fresh connection each time PROC SQL is invoked, then execute our explicit pass-through code. PROC FedSQL takes that one step further, and automatically connects, executes, then disconnects for every explicit pass-through statement. These operations are timelier, and therefore less likely to timeout than LIBNAME access.

 

I'm not a database user; I rely on small datasets maintained only in SAS. Is there utility in still learning SAS Viya or CAS?

SAS Viya has an enormous number of visualization and reporting tools. Visual Analytics and Visual Statistics are good examples, and they all rely on CAS tables. Even if your data isn’t enormous, it must be in CAS to be accessible to those tools. So, if you're interested leveraging the advanced Viya visualization technology, machine learning tools, or any of the innovative technology coming out in Viya, you’ll want to put your data in CAS no matter the size or original source. So that's a good reason to learn.

 

What is the difference between SAS/ACCESS and SAS Data connectors, in layman’s terms?

SAS/ACCESS connects a database to the SAS Compute Server in SAS 9 or in SAS Viya. A SAS Data Connector connects a database to the data portion of a caslib in CAS. Data Connectors are designed from the ground up for efficient data access SAS Viya’s massively parallel, in-memory analytic processing engine.

 

Does PROC TABULATE go to FEDSQL as well as PROC MEANS?

PROC TABULATE and PROC MEANS do not generate FedSQL. If the DATA= option references a database table, the SAS/ACCESS engine will convert as much as possible of the code into database-specific SQL.

 

Is PROC FEDSQL available in Base SAS, or does it need Viya to run?

Both PROC FedSQL and PROC DS2 are available in base SAS. The earliest production release for both was the SAS 9.4 (2013) release. All versions of SAS Viya include PROC FedSQL and PROC DS2.

 

Do FedSQL and DS2 require a SAS/ACCESS license? Can FEDSQL and DS2 connect to a database outside of SAS/ACCESS and process them fast?

Both PROC DS2 and PROC FEDSQL are part of base SAS in all versions since the release of SAS 9.4 (summer of 2013), so no special license is required to use those languages. However, accessing database data in SAS always requires a SAS/ACCESS license for the database in question. If a FedSQL threaded driver is available for a database, the driver is automatically included with the SAS/ACCESS license. In SAS 9, SAS/ACCESS is usually licensed “a la carte”, so you’ll need to check to see if you have a license for the database you want to access. But there’s good news for SAS Viya users:  Viya comes with a broad portfolio of SAS/ACCESS products, so you can access most databases without worrying about licenses using both Compute Server and CAS code.

 

Could SQL run the MAGIC code option in SAS Viya?

The undocumented PROC SQL MAGIC option worked as expected in the SAS Compute server when I tested it in SAS Viya Long-Term Support 2023.03.

 

Can CAS use "USER FORMAT Catalogs"? Or have CAS generate the USER FORMATs?

Yes, you can make SAS user formats available to CAS. See the “Managing User-Defined Formats on the SAS Viya Platform” topic in the online documentation and Kevin Russell’s SAS Global Forum paper titled “Next Steps: Important Considerations for Moving Your Data and Formats into CAS”.

 

Do you recommend adjusting the READBUFF option in libname or proc SQL explicit pass-through to improve query speed? I've found it helpful when querying Oracle through proc SQL explicit pass-through.

The READBUFF= option controls the number of rows read from the DBMS into the in-memory buffer for each fetch operation. Decreasing the number of fetch operations can decrease network activity, which an increasing overall performance. But higher values for READBUFF= use more memory, which can lead to memory management issues and decrease performance. When speed is critical it’s worthwhile to try different values for READBUFF= as you benchmark your code to see if it improves speed. However, the READBUFF= option is applied as either a SAS dataset option or LIBNAME statement option, so it only affects IMPLICIT pass-through. It can’t be used in DBMS SQL written for explicit pass-through.

 

What is the most optimal option for high-volume star models? If you are sourcing database data for analysis with one of the base SAS procedures capable of in-database processing and will not need to access the database data again in your process, you will usually get faster results using a well-crafted SAS procedure step and LIBNAME access (implicit passthrough). If the same results set from your database query will be accessed multiple times, consider creating a SAS dataset or CAS table from the query result and using that for subsequent processing. For this technique, an explicit passthrough query in FedSQL or PROC SQL will probably yield the fastest results.

 

Recommended Resources

SAS Viya With SingleStore: Maximize the Value of Your Data

Accessing Databases in the Cloud – SAS Data Connectors and Microsoft Azure

We’ve all gone to cloud -- but what about my SAS data(base)?

SAS Viya Best practices with Snowflake Data

SAS/ACCESS® for Relational Databases

CAS Data Connectors

Moving from SAS®9 to SAS® Viya®

Move to Viya Board

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q/A, slides and recordings from other SAS Ask the Expert webinars. Just hit SUBSCRIBE here:

 

SASJedi_1-1688662748676.png

 

 

Contributors
Version history
Last update:
‎07-06-2023 01:27 PM
Updated by:

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Article Tags