BookmarkSubscribeRSS Feed

SAS Viya 3.4: What’s New in CAS Data Access?

Started ‎08-27-2018 by
Modified ‎08-27-2018 by
Views 4,121

In the new release of Viya, it is certainly not a surprise to say that there is a lot of new capabilities. Huh? Let’s briefly highlight some of the new options offered regarding the access of data from SAS Cloud Analytic Services (CAS).

 

First new big item on this space is the ability to load data in CAS from an Amazon Web Services (AWS) S3 location. Write-back is also possible. Supported file formats are SASHDAT and CSV.

 

It adds a new option to the Platform Data Sources already available (PATH, HDFS, DNFS) that allow to persist data in the SASHDAT and CSV formats, and it enables to leverage a cloud-based storage service. It supports parallel reads and writes of the data on a MPP CAS infrastructure.

 

CAS access to S3 data is done through the use of a CASLIB. Here is an example of how to create a CASLIB that points to S3:

 

caslib mys3 datasource=(
   srctype="s3",
   bucket="mybucket",
   awsConfigPath="/home/myuser/.aws/config",
   awsCredentialsPath="/home/myuser/.aws/credentials",
   region="US_East",
   objectPath="/mypath/"
) ;

 

 

Second theme in this area is the coming availability of new Data Connectors and Data Connect Accelerators:

  • SAS Data Connector to JDBC
  • SAS Data Connector to MySQL
  • SAS Data Connector to Spark and SAS Data Connect Accelerator for Spark (Limited Availability)
  • SAS Data Connector to Vertica

Regarding Spark, as mentioned, it is planned for Limited Availability, meaning that SAS Product Management must be involved. 

 

JDBC is a good alternative to ODBC for generic access to lots of databases. It is much easier to setup than ODBC. If your CAS environment has the correct setup regarding JAVA, then you only need:

  • To download the JDBC driver for your database, and make it available to CAS
  • To define a JDBC CASLIB by setting the right classpath and class name and by specifying a valid URL

Some examples with PostgreSQL and Oracle:

 

caslib jdbcpg desc='JDBC PostgreSQL' 
   dataSource=(srctype='jdbc',
               url="jdbc:postgresql://mydb.example.com:5432/casdm",
               authenticationdomain="PGAuth",
               schema="public",
               class="org.postgresql.Driver",
               classpath="/opt/sas/jdbc") ;

caslib jdbcora desc='JDBC Oracle' 
   dataSource=(srctype='jdbc',
               url="jdbc:oracle:thin:@//mydb.example.com:1521/XE",
               authenticationdomain="ORAAuth",
               schema="HR",
               class="oracle.jdbc.OracleDriver",
               classpath="/opt/sas/jdbc") ;

 

Last point in this topic is the new FedSQL explicit pass-through facility. Viya 3.3 introduced implicit pass-through. Now it’s time for explicit pass-through.

 

Explicit pass-through enables users to send SQL queries directly to the database for execution. In the CAS context, it means that you will be able to run a specific SQL query inside the database, using the database syntax, and get the results back directly loaded into CAS.

 

Here is an example:

 

caslib PG datasource=(srctype="postgres",authenticationdomain="PGAuth",
   server="mydb.example.com",database="dvdrental",schema="public") libref=PG ;
caslib ORA datasource=(srctype="oracle",authenticationdomain="ORAAuth",
   path="//mydb.example.com:1521/xe",schema="CASDM") libref=ORA ;

proc fedsql sessref=mySession _method ;
   create table PG.myresults{options replace=true} as 
      select film.title
   from connection to PG(select "title", "film_id" from "film") as film,
        connection to ORA(select "film_id" from "film_category" where "category_id"=2) as film_category
   where film.film_id=film_category.film_id ;
quit ;

 

This simple example shows how to send a query directly to PostgreSQL and Oracle and get the results joined in CAS.

 

Explicit pass-through is triggered by the CONNECTION TO clause, is supported with every database CAS supports. The SQL enclosed in the CONNECTION TO clause is a SELECT-type query and must produce a result set that will be routed back to CAS. A Data Definition Language (DDL) statement will not work.

 

That’s all folks. Thanks for reading.

Version history
Last update:
‎08-27-2018 09:51 AM
Updated by:
Contributors

sas-innovate-2024.png

Today is the last day to save with the early bird rate! Register today for just $695 - $100 off the standard rate.

 

Plus, pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags