BookmarkSubscribeRSS Feed

SAS Viya 3.4: What’s New in CAS Data Access?

Started ‎08-27-2018 by
Modified ‎08-27-2018 by
Views 4,641

In the new release of Viya, it is certainly not a surprise to say that there is a lot of new capabilities. Huh? Let’s briefly highlight some of the new options offered regarding the access of data from SAS Cloud Analytic Services (CAS).

 

First new big item on this space is the ability to load data in CAS from an Amazon Web Services (AWS) S3 location. Write-back is also possible. Supported file formats are SASHDAT and CSV.

 

It adds a new option to the Platform Data Sources already available (PATH, HDFS, DNFS) that allow to persist data in the SASHDAT and CSV formats, and it enables to leverage a cloud-based storage service. It supports parallel reads and writes of the data on a MPP CAS infrastructure.

 

CAS access to S3 data is done through the use of a CASLIB. Here is an example of how to create a CASLIB that points to S3:

 

caslib mys3 datasource=(
   srctype="s3",
   bucket="mybucket",
   awsConfigPath="/home/myuser/.aws/config",
   awsCredentialsPath="/home/myuser/.aws/credentials",
   region="US_East",
   objectPath="/mypath/"
) ;

 

 

Second theme in this area is the coming availability of new Data Connectors and Data Connect Accelerators:

  • SAS Data Connector to JDBC
  • SAS Data Connector to MySQL
  • SAS Data Connector to Spark and SAS Data Connect Accelerator for Spark (Limited Availability)
  • SAS Data Connector to Vertica

Regarding Spark, as mentioned, it is planned for Limited Availability, meaning that SAS Product Management must be involved. 

 

JDBC is a good alternative to ODBC for generic access to lots of databases. It is much easier to setup than ODBC. If your CAS environment has the correct setup regarding JAVA, then you only need:

  • To download the JDBC driver for your database, and make it available to CAS
  • To define a JDBC CASLIB by setting the right classpath and class name and by specifying a valid URL

Some examples with PostgreSQL and Oracle:

 

caslib jdbcpg desc='JDBC PostgreSQL' 
   dataSource=(srctype='jdbc',
               url="jdbc:postgresql://mydb.example.com:5432/casdm",
               authenticationdomain="PGAuth",
               schema="public",
               class="org.postgresql.Driver",
               classpath="/opt/sas/jdbc") ;

caslib jdbcora desc='JDBC Oracle' 
   dataSource=(srctype='jdbc',
               url="jdbc:oracle:thin:@//mydb.example.com:1521/XE",
               authenticationdomain="ORAAuth",
               schema="HR",
               class="oracle.jdbc.OracleDriver",
               classpath="/opt/sas/jdbc") ;

 

Last point in this topic is the new FedSQL explicit pass-through facility. Viya 3.3 introduced implicit pass-through. Now it’s time for explicit pass-through.

 

Explicit pass-through enables users to send SQL queries directly to the database for execution. In the CAS context, it means that you will be able to run a specific SQL query inside the database, using the database syntax, and get the results back directly loaded into CAS.

 

Here is an example:

 

caslib PG datasource=(srctype="postgres",authenticationdomain="PGAuth",
   server="mydb.example.com",database="dvdrental",schema="public") libref=PG ;
caslib ORA datasource=(srctype="oracle",authenticationdomain="ORAAuth",
   path="//mydb.example.com:1521/xe",schema="CASDM") libref=ORA ;

proc fedsql sessref=mySession _method ;
   create table PG.myresults{options replace=true} as 
      select film.title
   from connection to PG(select "title", "film_id" from "film") as film,
        connection to ORA(select "film_id" from "film_category" where "category_id"=2) as film_category
   where film.film_id=film_category.film_id ;
quit ;

 

This simple example shows how to send a query directly to PostgreSQL and Oracle and get the results joined in CAS.

 

Explicit pass-through is triggered by the CONNECTION TO clause, is supported with every database CAS supports. The SQL enclosed in the CONNECTION TO clause is a SELECT-type query and must produce a result set that will be routed back to CAS. A Data Definition Language (DDL) statement will not work.

 

That’s all folks. Thanks for reading.

Version history
Last update:
‎08-27-2018 09:51 AM
Updated by:
Contributors

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

SAS AI and Machine Learning Courses

The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.

Get started

Article Tags