
Connecting SAS Viya in the Cloud to On-Premises Data Sources

Started ‎04-02-2024 by
Modified ‎04-02-2024 by

Because SAS Viya's integrated SAS Compute server is an enhanced version of the traditional SAS engine, users have to change very little of their legacy SAS code once their implementation is configured to access the required data. For the most part, only the SAS library definitions need to change.

 

So how can we configure data access to on-premises data from SAS Viya in the cloud so we can migrate programs as easily as possible? We'll examine some strategies below.

 

1. SAS Cloud Data Exchange

 

SAS Cloud Data Exchange (CDE) allows SAS Viya to access on-premises databases and ODBC/JDBC sources as well as SAS data sets.

 

CDE uses a co-located data agent that runs within the SAS Viya cluster in the cloud and a remote data agent that runs on-premises. The two agents connect to each other and pass data back and forth between the cloud and the on-site network.

 

When using CDE, SAS library definitions must be changed to point to the CDE data source reference rather than the direct reference used when the code ran on-premises.

 

2. Cloud Provider Network Connections

 

Every major cloud provider offers one or more mechanisms for securely connecting cloud networks to on-premises networks, for example VPN gateways and dedicated private links such as Azure ExpressRoute.

 

 

Each of these mechanisms provides a network connection between your SAS Viya implementation and all of your on-premises resources. With such a connection in place, existing SAS programs should require almost no modification, because the on-premises resources look just like other nodes on the network.
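
Before migrating programs over such a connection, it can help to confirm that the on-premises endpoints are actually reachable from the cloud side. The following is a minimal Python sketch of a TCP reachability check, not an official SAS utility. The function names `can_reach` and `check_endpoints` are illustrative, and the host names and ports in the usage note are hypothetical placeholders.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Check a mapping of label -> (host, port) and return label -> bool."""
    return {
        label: can_reach(host, port, timeout)
        for label, (host, port) in endpoints.items()
    }
```

A call such as `check_endpoints({"sqlserver": ("db01.corp.example", 1433)})` (a hypothetical host and port) would be run from a machine inside the cloud network. A `False` result points to routing, firewall, or DNS issues rather than anything in the SAS code itself.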

 

For more information on connecting SAS Viya on Azure to on-premises with VPN and ExpressRoute gateways see this post, Connecting Viya in Azure to On-Prem with Azure VPN, ExpressRoute (Intro).

 

3. Data Pipelines

 

Data pipelines are sets of processes that move data from one platform to another. They can be used to synchronize key data sources from on-premises to SAS Viya in the cloud.

 

Like traditional ETL processes, data pipelines require development and regular maintenance. Also like ETL, initial loads, incremental loads, data transformation, required computing resources, execution frequency, and data staleness are all considerations.
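
The difference between an initial load and an incremental load can be sketched in a few lines. The Python example below is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the on-premises source and the cloud-side copy, and the `orders` table, its `updated_at` column, and the `incremental_load` function are hypothetical names invented for this sketch. A real pipeline would use the actual source and target systems and a more robust change-tracking scheme.

```python
import sqlite3


def incremental_load(source, target, last_watermark):
    """Copy rows changed after last_watermark from source to target.

    Returns the new watermark (the largest updated_at value copied),
    or last_watermark unchanged if nothing new was found.
    """
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # INSERT OR REPLACE acts as an upsert keyed on the primary key.
    target.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][2] if rows else last_watermark


def make_db():
    """Create an in-memory database with the hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    return conn


source = make_db()  # stands in for the on-premises database
target = make_db()  # stands in for the cloud-side copy

source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-04-01T08:00:00"), (2, 25.5, "2024-04-01T09:00:00")],
)
source.commit()

# Initial load: an empty watermark sorts before every ISO timestamp,
# so every row is copied.
watermark = incremental_load(source, target, "")

# Later, one row is updated and one is added on the source side.
source.execute(
    "UPDATE orders SET amount = 12.0, updated_at = '2024-04-02T08:00:00' WHERE id = 1"
)
source.execute("INSERT INTO orders VALUES (3, 7.25, '2024-04-02T09:00:00')")
source.commit()

# Incremental load: only the two changed rows are moved.
watermark = incremental_load(source, target, watermark)
```

How often `incremental_load` runs determines how stale the cloud copy can become, which is the execution-frequency versus data-staleness trade-off mentioned above.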

 

When using data pipelines, migrated programs may need more extensive changes because data storage formats may change. For example, database content may be migrated to Parquet files in the cloud, Excel files may be migrated to CSVs, and so on.

 

For an example of a SAS Viya data pipeline using SingleStore, see this post, SAS Viya with SingleStore: Near Real-Time Dashboarding Using SAS Visual Analytics.

 

4. Database Replication

 

Some on-premises data servers like Microsoft SQL Server offer replication capabilities to corresponding data servers in the cloud.

 

5. On-Premises Cloud

 

Locating the cloud on-premises using OpenShift is a completely different approach from those above. In this scenario, SAS Viya is deployed not with a remote cloud provider like Azure, AWS, or GCP, but on local hardware using the OpenShift container platform. In this type of deployment, on-premises files can be accessed via mechanisms like the Kubernetes NFS storage class, and databases can be accessed directly over the local TCP/IP network.

 

For information on deploying SAS Viya on OpenShift, see this blog post, SAS Viya on Red Hat OpenShift – Part 1.

 

Further Considerations

 

Each of the strategies above comes with its own requirements, including hardware, network, security, and data latency. A Viya implementation may employ different strategies for different data sources, for reasons like data volume, data volatility, and physical distance to the source.

 

Has your customer employed one of these strategies? Are there other strategies that they've employed? Thanks!

Version history
Last update:
‎04-02-2024 12:36 PM