BookmarkSubscribeRSS Feed

SAS Viya 2021.1 and Accessing AWS Redshift Cluster

Started ‎10-01-2021 by
Modified ‎10-01-2021 by
Views 4,371

When you deploy SAS Viya 2021.1 at AWS EKS cluster using viya4-iac-aws and viya4-deployment packages with default options, the EKS cluster will end up in a secured VPC (Virtual Private Cloud). A logically isolated virtual network. The access to and from the outside services is not open until the EKS VPC configuration is updated. Similarly, when you create an AWS Redshift cluster ends up in a separate secured VPC. The Viya 2021.1 (EKS cluster) cannot access the AWS Redshift cluster as both environments are in two different secured VPC.

 

The viya4-iac-aws supports the use of predefined VPC. However, most of the users prefer to host Viya 2021.1 (K8s) in a separate VPC.

 

To access the Redshift cluster from CAS requires updating the VPCs inbound and outbound security rules or creating a VPC peering between them. The VPC Peering process updates the inbound and outbound rules on both sides of VPC to enable traffic flow between the Viya 2021.1 services and Redshift Cluster.

 

This post is about accessing the Redshift Cluster from CAS hosted in separate secured VPCs by creating peering.

Pre-requisites

  • AWS EKS and Redshift in separate secured VPCs
  • Permission to create VPC peering between the two VPCs
  • User access to Redshift Cluster database

Data access path

The following picture describes the SAS Viya(CAS) access to the Redshift database.

 

Viya4_and_Accessing_Redshift_Cluster_1.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

Steps to create VPC Peering

Collect VPC information from EKS cluster

Assuming you have the EKS cluster up and running with Viya 2021.1 software deployed. If you navigate to the EKS cluster configuration page, the networking tab contains the PVC Id. If you click on the VPC Id, it will take you to the VPC dashboard to show the VPC name and details. The following screenshot describes the EKS VPC name and its properties. Keep a note about the VPC name and id used during peering.

 

Viya4_and_Accessing_Redshift_Cluster_2.png

 

Viya4_and_Accessing_Redshift_Cluster_3.png

Collect VPC information from Redshift cluster

Assuming you have a Redshift cluster up and running. If you navigate to Redshift cluster properties, the “network and security” tab contains PVC Id. If you click on the VPC Id, it will take you to the VPC dashboard to show the VPC name and details. The following screenshot describes the Redshift VPC name and its properties. Keep a note about the VPC name and id used during peering.

 

Viya4_and_Accessing_Redshift_Cluster_4.png

 

Viya4_and_Accessing_Redshift_Cluster_5.png

Creating VPC Peering between Viya EKS and Redshift cluster

Assuming both Viya EKS and Redshift cluster are up and running on separate VPCs. If you navigate to the AWS VPC dashboard, there is a section “Peering Connection” to create a peering. On the peering connection editor, select the local VPC as EKS VPC and another VPC to peer with as Redshift VPC, the information you have collected in previous steps. The following screenshot describes the creation of a VPC Peering between EKS VPC and Redshift VPC. Keep a note about the VPC peering name used during the route table update.

 

Viya4_and_Accessing_Redshift_Cluster_6.png

 

Once Peering is created, the VPC peering dashboard will have a new entry with the status as ‘pending acceptance’ by the owner of the target VPC. If you are the owner or have permission on the target VPC, then accept the request or ask for help from the owner of the VPC. The following screenshot describes the acceptance of Peering from target VPC for the connection request.

 

Viya4_and_Accessing_Redshift_Cluster_7.png

 

After accepting the request, the VPC peering will be active.

 

Viya4_and_Accessing_Redshift_Cluster_8.png

Update EKS VPC main route table with peering connection

Once VPC peering is active, you need to include the peering information in the main routing table of both sides of the peering VPCs. Navigate to the EKS VPC properties and look for the "Main route table" and update the routes with additional entry using the IP range of Redshift VPC and Peering name (created in the previous step).

 

Viya4_and_Accessing_Redshift_Cluster_9.png

 

The following screenshot describes new entries in the routing table, one with 172.31.0.0/16 (Redshift VPC IP range) for Peering connection and 0.0.0.0/0 for internet gateway.

 

Viya4_and_Accessing_Redshift_Cluster_10.png

 

Also, verify the subnet association in the routing table. Sometimes, adding a new entry in the routing table does not enable the traffic from other VPC. It requires updating the subnet association of the main route table. If the subnet association tab does not have all the subnet associated, you need to include all the available subnet in this VPC.

 

Viya4_and_Accessing_Redshift_Cluster_11.png

 

Viya4_and_Accessing_Redshift_Cluster_12.png

Update Redshift VPC main route table with peering connection

Once VPC peering is active, you need to include the peering information in the main routing table of both sides of the peering VPCs. Navigate to the Redshift VPC properties and look for the "Main route table" and update the routes with additional entry using the IP range of EKS VPC and Peering name (created in the previous step).

 

Viya4_and_Accessing_Redshift_Cluster_13.png

 

The following screenshot describes new entries in the routing table, one with 192.168.0.0/16 (EKS VPC IP range) for the Peering connection. If 0.0.0.0/0 for internet gateway entry is not there, you can add it as well.

 

Viya4_and_Accessing_Redshift_Cluster_14.png

CAS load from AWS Redshift cluster database

Once VPC Peering is active and respective VPCs are updated with peering information, CAS can be loaded from the Redshift database table using serial and Multi-node data load mechanisms. The following code describes the CAS load from a Redshift database table using the serial data load method.

 

Code:

 

%let userid=utkuma;

%let dbserver="&userid.dmviya4-redshift.cxsygvt812sh.us-east-1.redshift.amazonaws.com";
%let dbuser="viyademo01";
%let dbpwd="XXXXXXX";
%let userdb="geldw" ;
%let schema="public";

CAS mySession  SESSOPTS=(CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=true);

caslib RScaslib datasource=(srctype="redshift",username=&dbuser,password=&dbpwd,
server=&dbserver,database=&userdb ,schema=&schema) ;

/*save a CAS table to Redshift */
proc casutil incaslib="RScaslib" outcaslib="RScaslib";
 load data=sashelp.cars casout="cars" replace;
 save casdata="cars" casout="cars_cas"  replace;
 list files;
quit ;

/* load a Redshift table to CAS*/
proc casutil incaslib="RScaslib" outcaslib="RScaslib" ;
 load casdata="cars_cas" casout="cars_cas" replace;
 list tables;
quit ;

cas mysession terminate;

 

Log extract :

 

.....
..............
90   caslib RScaslib datasource=(srctype="redshift",username=&dbuser,password=&dbpwd,
91   server=&dbserver,database=&userdb ,schema=&schema) ;
NOTE: Executing action 'table.addCaslib'.
NOTE: 'RSCASLIB' is now the active caslib.
NOTE: Cloud Analytic Services added the caslib 'RSCASLIB'.
NOTE: Action 'table.addCaslib' used (Total process time):
NOTE: Action to ADD caslib RSCASLIB completed for session MYSESSION.
92
93   /*save a CAS table to Redshift */
94   proc casutil incaslib="RScaslib" outcaslib="RScaslib";
NOTE: The UUID '78926dd3-bcc4-844a-af89-b75b90b72613' is connected using session MYSESSION.
95    load data=sashelp.cars casout="cars" replace;
NOTE: The INCASLIB= option is ignored when using the DATA= option in the LOAD statement.
NOTE: Executing action 'table.addTable'.
NOTE: Action 'table.addTable' used (Total process time):
NOTE: SASHELP.CARS was successfully added to the "RSCASLIB" caslib as "CARS".
96    save casdata="cars" casout="cars_cas"  replace;
NOTE: Executing action 'table.save'.
NOTE: Performing serial SaveTable action using SAS Data Connector to Redshift.
….
………….
…………………..
101  /* load a Redshift table to CAS*/
102  proc casutil incaslib="RScaslib" outcaslib="RScaslib" ;
NOTE: The UUID '78926dd3-bcc4-844a-af89-b75b90b72613' is connected using session MYSESSION.
103   load casdata="cars_cas" casout="cars_cas" replace;
NOTE: Executing action 'table.loadTable'.
NOTE: Performing serial LoadTable action using SAS Data Connector to Redshift.
NOTE: Cloud Analytic Services made the external data from cars_cas available as table CARS_CAS in caslib RScaslib.
NOTE: Action 'table.loadTable' used (Total process time):
NOTE: The Cloud Analytic Services server processed the request in 0.093455 seconds.
      cpu time            0.03 seconds
..........
.................

 

The following screenshot describes the Redshift database table loaded into CAS.

 

Viya4_and_Accessing_Redshift_Cluster_15.png

Resources

Version history
Last update:
‎10-01-2021 02:56 PM
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags