
Contemplating Viya’s access to a world of data

Started 07-26-2018
Modified 07-26-2018

world1.png

With SAS Cloud Analytic Services (CAS) as part of Viya, we have new technologies for loading data directly into our in-memory analytics engine. By expanding the list of data sources that CAS can load from (and write to) directly, we simplify data management in general and speed up the ingestion of data for analytics in particular. While the list of direct-access sources has grown with Viya 3.4, it's not universal, so CAS can also load data indirectly through the conventional Base SAS runtime.

 

DIRECT ACCESS TO DATA

There are three major classifications which describe how CAS can get to data directly:

 

Platform Data Sources

world2.png

Platform data sources are those that CAS can work with without additional SAS software licensing. In other words, these sources are available to all CAS servers, assuming the environment supports them. When working with a platform source, CAS can typically read and write SASHDAT and CSV files.

 

  • POSIX-compliant local file systems (SAS data sets and Microsoft Excel documents are also supported here)
  • Hadoop Distributed File System
  • SAS LASR Analytic Server
  • Amazon S3 (new with Viya 3.4)

 

Both serial (flexible) and parallel (scalable) data transfer options are available for bringing data into CAS from platform sources.

 

SAS Data Connectors

SAS Data Connectors are the CAS equivalent of the SAS/ACCESS engines in SAS 9, which access third-party data sources natively. Indeed, with Viya, licensing the SAS/ACCESS product is required to get the SAS Data Connector functionality for CAS. They also work similarly, in that CAS typically requires the third-party data source's client software to be correctly installed and configured on the CAS hosts to handle that native access.

 

The list of sources accessible through SAS Data Connectors is getting longer with each release of Viya: Hadoop, Hive, Impala, ODBC, Oracle, PC Files, PostgreSQL, Amazon Redshift, DB2 (UNIX), SQL Server, SAP HANA, SAS SPD Engine (in HDFS), and Teradata. New with Viya 3.4 are JDBC, MySQL, Spark, and Vertica.

 

Both serial (flexible) and multi-node (scalable) data transfer options are available for bringing data into CAS through SAS Data Connectors.

 

SAS Data Connect Accelerators

SAS Data Connect Accelerators give CAS the ability to communicate directly with the SAS In-Database Embedded Process (EP). The accelerator products require the associated SAS/ACCESS product. For Viya, the EP can be deployed to work with data in:

 

  • Hadoop
  • Teradata
  • Spark (new with Viya 3.4 -- but very limited availability!)

 

The SAS Data Connect Accelerators provide CAS with a parallel (scalable) data transfer technique. If desired, CAS can be programmatically directed to fall back gracefully to multi-node or serial transfer through the associated SAS Data Connector product if the EP is unable to provide data for some reason.
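The fallback order described here can be modeled in a few lines. The Python sketch below is only an illustration of the decision logic: the function name `choose_transfer` and its parameters are hypothetical, and it does not call or represent any SAS API.

```python
# Hypothetical model of the fallback order described in this section.
# Nothing here is a SAS API; the names are invented for illustration.

def choose_transfer(ep_available, allow_fallback=True, multinode_configured=False):
    """Pick a transfer technique: parallel via the EP, else a connector fallback."""
    if ep_available:
        # SAS Data Connect Accelerator working with the Embedded Process
        return "parallel"
    if not allow_fallback:
        raise RuntimeError("Embedded Process unavailable and fallback not requested")
    # The fallback goes through the associated SAS Data Connector
    return "multi-node" if multinode_configured else "serial"


print(choose_transfer(ep_available=True))
print(choose_transfer(ep_available=False, multinode_configured=True))
print(choose_transfer(ep_available=False))
```

The point of the sketch is the ordering: the scalable parallel path is tried first, and the connector-based transfers only come into play when the caller has asked for a fallback.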

 

 

INDIRECT ACCESS TO DATA

But wait, there are many more data sources in the world than just those listed above. So there's another way to load data into CAS - indirectly through the use of a CAS client. While not as simple as pointing CAS at the data directly, this approach gives us a lot more options on how to get the data we need.

world3.png

 

SAS Viya Compute Server

In particular, we can use the SAS Viya Compute Server which gives us the classic SAS runtime for running our SAS program code. And of course, with the power of original SAS on hand, we have access to a number of native formats like SAS data sets, catalogs, SPD Engine tables, and more.

 

There are also several SAS/ACCESS products which we can use in the Viya Compute Server which do not have equivalent SAS Data Connectors: HAWQ, Netezza, Greenplum, and SAP R/3. And with Viya 3.4, we can now add SAP ASE (formerly known as Sybase) to this list.

 

SAS 9 Compute Server

If SAS 9 is available in your environment, then the list of SAS/ACCESS engines is even longer. They include those already mentioned as well as ADABAS, Aster, CA IDMS, SAP IQ, DATACOM/DB, IMS-DL/I, INFORMIX, OLEDB, SYSTEM 2000, and PI System.
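To make the direct/indirect split concrete, the lists in this article can be collected into a small lookup. The following Python sketch is purely illustrative: the set names and the `access_paths` function are hypothetical and are not part of any SAS API, and the contents reflect only the Viya 3.4-era lists given here. For brevity, the SAS 9 entry covers only the engines named as additional to those already mentioned.

```python
# Illustrative only: source lists copied from this article (Viya 3.4 era).
# All names below are hypothetical; none of this is a SAS API.

PLATFORM = {"local file system", "hdfs", "sas lasr analytic server", "amazon s3"}

DATA_CONNECTOR = {
    "hadoop", "hive", "impala", "odbc", "oracle", "pc files", "postgresql",
    "amazon redshift", "db2 (unix)", "sql server", "sap hana",
    "sas spd engine (in hdfs)", "teradata",
    "jdbc", "mysql", "spark", "vertica",  # added with Viya 3.4
}

ACCELERATOR = {"hadoop", "teradata", "spark"}

VIYA_COMPUTE_ONLY = {"hawq", "netezza", "greenplum", "sap r/3", "sap ase"}

SAS9_ONLY = {
    "adabas", "aster", "ca idms", "sap iq", "datacom/db", "ims-dl/i",
    "informix", "oledb", "system 2000", "pi system",
}


def access_paths(source):
    """Return the ways CAS can reach `source`, most direct first."""
    key = source.strip().lower()
    paths = []
    if key in PLATFORM:
        paths.append("direct: platform data source")
    if key in ACCELERATOR:
        paths.append("direct: SAS Data Connect Accelerator")
    if key in DATA_CONNECTOR:
        paths.append("direct: SAS Data Connector")
    if key in VIYA_COMPUTE_ONLY:
        paths.append("indirect: SAS/ACCESS in SAS Viya Compute Server")
    if key in SAS9_ONLY:
        paths.append("indirect: SAS/ACCESS in SAS 9")
    return paths


print(access_paths("Teradata"))
print(access_paths("Netezza"))
```

An empty result only means the source does not appear in this article's lists, not that it cannot be reached at all.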

 

Moving data from SAS to CAS

Whichever SAS runtime we're working in, we can use the new "cas" libname engine to send data directly over for CAS to work with. If we're using an older version of SAS, then we can still rely on SAS/CONNECT technology as a bridge to handle that transfer.

 

 

KEEPING UP

Technology is always a moving target. And SAS is continuously working towards increasing the scope and functionality of our software. So keep an eye on the documentation and enablement materials - like SAS Communities - to ensure your plan to work with a world of data is making the best use of available resources.

Comments

excellent read, thanks Rob

