When you deploy SAS Viya 2021.1 at AWS EKS cluster using viya4-iac-aws and viya4-deployment packages with default options, the EKS cluster will end up in a secured VPC (Virtual Private Cloud). A logically isolated virtual network. The access to and from the outside services is not open until the EKS VPC configuration is updated. Similarly, when you create an AWS Redshift cluster ends up in a separate secured VPC. The Viya 2021.1 (EKS cluster) cannot access the AWS Redshift cluster as both environments are in two different secured VPC.
The viya4-iac-aws supports the use of predefined VPC. However, most of the users prefer to host Viya 2021.1 (K8s) in a separate VPC.
To access the Redshift cluster from CAS requires updating the VPCs inbound and outbound security rules or creating a VPC peering between them. The VPC Peering process updates the inbound and outbound rules on both sides of VPC to enable traffic flow between the Viya 2021.1 services and Redshift Cluster.
This post is about accessing the Redshift Cluster from CAS hosted in separate secured VPCs by creating peering.
The following picture describes the SAS Viya(CAS) access to the Redshift database.
Assuming you have the EKS cluster up and running with Viya 2021.1 software deployed. If you navigate to the EKS cluster configuration page, the networking tab contains the PVC Id. If you click on the VPC Id, it will take you to the VPC dashboard to show the VPC name and details. The following screenshot describes the EKS VPC name and its properties. Keep a note about the VPC name and id used during peering.
Assuming you have a Redshift cluster up and running. If you navigate to Redshift cluster properties, the “network and security” tab contains PVC Id. If you click on the VPC Id, it will take you to the VPC dashboard to show the VPC name and details. The following screenshot describes the Redshift VPC name and its properties. Keep a note about the VPC name and id used during peering.
Assuming both Viya EKS and Redshift cluster are up and running on separate VPCs. If you navigate to the AWS VPC dashboard, there is a section “Peering Connection” to create a peering. On the peering connection editor, select the local VPC as EKS VPC and another VPC to peer with as Redshift VPC, the information you have collected in previous steps. The following screenshot describes the creation of a VPC Peering between EKS VPC and Redshift VPC. Keep a note about the VPC peering name used during the route table update.
Once Peering is created, the VPC peering dashboard will have a new entry with the status as ‘pending acceptance’ by the owner of the target VPC. If you are the owner or have permission on the target VPC, then accept the request or ask for help from the owner of the VPC. The following screenshot describes the acceptance of Peering from target VPC for the connection request.
After accepting the request, the VPC peering will be active.
Once VPC peering is active, you need to include the peering information in the main routing table of both sides of the peering VPCs. Navigate to the EKS VPC properties and look for the "Main route table" and update the routes with additional entry using the IP range of Redshift VPC and Peering name (created in the previous step).
The following screenshot describes new entries in the routing table, one with 172.31.0.0/16 (Redshift VPC IP range) for Peering connection and 0.0.0.0/0 for internet gateway.
Also, verify the subnet association in the routing table. Sometimes, adding a new entry in the routing table does not enable the traffic from other VPC. It requires updating the subnet association of the main route table. If the subnet association tab does not have all the subnet associated, you need to include all the available subnet in this VPC.
Once VPC peering is active, you need to include the peering information in the main routing table of both sides of the peering VPCs. Navigate to the Redshift VPC properties and look for the "Main route table" and update the routes with additional entry using the IP range of EKS VPC and Peering name (created in the previous step).
The following screenshot describes new entries in the routing table, one with 192.168.0.0/16 (EKS VPC IP range) for the Peering connection. If 0.0.0.0/0 for internet gateway entry is not there, you can add it as well.
Once VPC Peering is active and respective VPCs are updated with peering information, CAS can be loaded from the Redshift database table using serial and Multi-node data load mechanisms. The following code describes the CAS load from a Redshift database table using the serial data load method.
Code:
%let userid=utkuma;
%let dbserver="&userid.dmviya4-redshift.cxsygvt812sh.us-east-1.redshift.amazonaws.com";
%let dbuser="viyademo01";
%let dbpwd="XXXXXXX";
%let userdb="geldw" ;
%let schema="public";
CAS mySession SESSOPTS=(CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=true);
caslib RScaslib datasource=(srctype="redshift",username=&dbuser,password=&dbpwd,
server=&dbserver,database=&userdb ,schema=&schema) ;
/*save a CAS table to Redshift */
proc casutil incaslib="RScaslib" outcaslib="RScaslib";
load data=sashelp.cars casout="cars" replace;
save casdata="cars" casout="cars_cas" replace;
list files;
quit ;
/* load a Redshift table to CAS*/
proc casutil incaslib="RScaslib" outcaslib="RScaslib" ;
load casdata="cars_cas" casout="cars_cas" replace;
list tables;
quit ;
cas mysession terminate;
Log extract :
.....
..............
90 caslib RScaslib datasource=(srctype="redshift",username=&dbuser,password=&dbpwd,
91 server=&dbserver,database=&userdb ,schema=&schema) ;
NOTE: Executing action 'table.addCaslib'.
NOTE: 'RSCASLIB' is now the active caslib.
NOTE: Cloud Analytic Services added the caslib 'RSCASLIB'.
NOTE: Action 'table.addCaslib' used (Total process time):
NOTE: Action to ADD caslib RSCASLIB completed for session MYSESSION.
92
93 /*save a CAS table to Redshift */
94 proc casutil incaslib="RScaslib" outcaslib="RScaslib";
NOTE: The UUID '78926dd3-bcc4-844a-af89-b75b90b72613' is connected using session MYSESSION.
95 load data=sashelp.cars casout="cars" replace;
NOTE: The INCASLIB= option is ignored when using the DATA= option in the LOAD statement.
NOTE: Executing action 'table.addTable'.
NOTE: Action 'table.addTable' used (Total process time):
NOTE: SASHELP.CARS was successfully added to the "RSCASLIB" caslib as "CARS".
96 save casdata="cars" casout="cars_cas" replace;
NOTE: Executing action 'table.save'.
NOTE: Performing serial SaveTable action using SAS Data Connector to Redshift.
….
………….
…………………..
101 /* load a Redshift table to CAS*/
102 proc casutil incaslib="RScaslib" outcaslib="RScaslib" ;
NOTE: The UUID '78926dd3-bcc4-844a-af89-b75b90b72613' is connected using session MYSESSION.
103 load casdata="cars_cas" casout="cars_cas" replace;
NOTE: Executing action 'table.loadTable'.
NOTE: Performing serial LoadTable action using SAS Data Connector to Redshift.
NOTE: Cloud Analytic Services made the external data from cars_cas available as table CARS_CAS in caslib RScaslib.
NOTE: Action 'table.loadTable' used (Total process time):
NOTE: The Cloud Analytic Services server processed the request in 0.093455 seconds.
cpu time 0.03 seconds
..........
.................
The following screenshot describes the Redshift database table loaded into CAS.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.