
Kubernetes Primer for SAS Viya: Networking

Started ‎12-17-2020 by
Modified ‎12-17-2020 by

With the new SAS Viya (2020.1 and later) running in a Kubernetes environment, there are a myriad of new concepts to get a grasp on. Let's take a quick look at some of the networking aspects presented by Kubernetes which SAS Viya will utilize. In particular, we'll look at three routes by which SAS Viya services can be reached from outside the cluster.

 

Services

Let's kick things off with k8s services. The idea is that you've got a SAS Viya service, like SAS Logon, and three instances of sas-logon are running. Which one do you connect to? How do you choose? What happens if one goes away and a new one spins up? Keeping track of exactly what is where in a k8s cluster is hard to do on your own. You might think you know how to get to the SAS Logon service, but its pod(s) have moved to different nodes. You might find the door locked - or the door to something else - or no door at all.

 

But the real answer is: Don't worry about it. We can define a k8s service that acts as a single point of contact and it'll keep track of where the three sas-logon pods are running for us, automatically routing traffic to them. Sounds easy, right?

 

Service type: ClusterIP

This is the default type of k8s service. It's only used for internal communication between pods inside the k8s cluster. You'll see it used a lot by the SAS Viya services which don't have external clients (i.e. most of them).

 

rc_1-ClusterIP.png

This illustration shows the ClusterIP service defined to listen for requests intended for the associated group of replicated pods. In other words, the ClusterIP acts like an internal load balancer: it presents one stable address, keeps track of where the pods are and which of them are ready for work, and spreads incoming connections across them. How is this defined? Well, you gotta brush up on your YAML skills. For this ClusterIP service, the YAML looks similar to:

 

---
apiVersion: v1
kind: Service
metadata: 
  name: my-sasviya-app-service
spec:
  selector: 
    app: my-sasviya-app
  type: ClusterIP
  ports: 
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP

 

 

And you can see from this that we've defined an object of kind "Service" and type "ClusterIP" and named it my-sasviya-app-service. This particular ClusterIP service is set up to listen on TCP port 80, and whatever comes in there will get routed to target port 80 on one of the pods labeled app: my-sasviya-app.
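
To see what that selector actually matches against, here's a minimal sketch of a Deployment that would run three replicas of the pods. The Deployment name, container name, and image reference are hypothetical placeholders, not real SAS Viya artifacts. The only part the service cares about is the app: my-sasviya-app label on the pod template.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-sasviya-app              # hypothetical name
spec:
  replicas: 3                       # three instances behind the one service
  selector:
    matchLabels:
      app: my-sasviya-app
  template:
    metadata:
      labels:
        app: my-sasviya-app         # the Service selector matches this label
    spec:
      containers:
      - name: my-sasviya-app        # hypothetical container name
        image: registry.example.com/my-sasviya-app:latest   # placeholder image
        ports:
        - containerPort: 80         # lines up with the Service targetPort

If a pod goes away and a replacement spins up on another node, the new pod carries the same label, so the service picks it up without any change on the client side.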

 

Service type: NodePort

You can define a NodePort type of k8s service so that all of the k8s nodes (i.e. host machines) will listen at the same port for the traffic you specify and then redirect that traffic to the internal ClusterIP of the target service. This is a brute-force approach and not very elegant. For SAS Viya, we try to avoid the use of NodePorts when possible, but in some circumstances, they can be useful.

 

rc_2-NodePort.png

 

With the NodePort defined, we can direct communication to any one of those host machines at the specified NodePort value, and then the communication will be forwarded to the specified app.

 

---
apiVersion: v1
kind: Service
metadata:  
  name: my-sasviya-app-nodeport
spec:
  selector:    
    app: my-sasviya-app
  type: NodePort
  ports:  
  - name: http
    port: 80
    targetPort: 80
    nodePort: 30080
    protocol: TCP

 

 

This YAML defines a NodePort service which exposes 2 ports: 30080 on every node for clients outside the cluster and 80 for clients inside the cluster. Traffic coming in on either port is then directed to target port 80 of the pods labeled app: my-sasviya-app. Note that the selector matches pod labels, not the name of another service. NodePorts have some interesting restrictions, too:

  • A NodePort can only expose 1 service. If you have multiple services to share, then define a NodePort for each.
  • By default, only ports 30000 - 32767 are available for use by NodePorts (the cluster administrator can change this range)
  • The client is responsible for keeping track of the node's hostname/IP address. If that changes, then the client must correct its connection profile.
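
If you don't need a specific port number, one way to sidestep port conflicts is to leave nodePort out and let Kubernetes pick a free port from the cluster's configured NodePort range. Here's a sketch of that variation; the service name is hypothetical.

---
apiVersion: v1
kind: Service
metadata:
  name: my-sasviya-app-nodeport-auto   # hypothetical name
spec:
  selector:
    app: my-sasviya-app
  type: NodePort
  ports:
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP
    # nodePort is omitted on purpose: Kubernetes assigns a free port
    # from the allowed NodePort range when the service is created

Once the service exists, kubectl get service my-sasviya-app-nodeport-auto shows the assigned port in the PORT(S) column, and that's the value external clients would need to use.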

 

Service type: LoadBalancer

A LoadBalancer service is usually the default method for exposing a service to clients outside the cluster. The interesting thing about a LoadBalancer is that it operates at the Transport Layer (Layer 4) of the OSI Network Model. This is where TCP and UDP operate, which means that a LoadBalancer can carry pretty much any kind of network communication (including TLS-encrypted traffic, which it simply passes through), but it has no intelligence as to what the content is or how to handle it.

 

rc_3-LoadBalancer.png

With a LoadBalancer, we can direct practically any form of network traffic into the SAS Viya pods of the k8s cluster: HTTP, TCP or UDP, IMAP, SMTP, really anything. As long as it makes sense to the SAS Viya app on the receiving end, it's all good.

 

---
apiVersion: v1
kind: Service
metadata:
  name: my-sasviya-app-lb
spec:
  selector:
    app: my-sasviya-app
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80

 

Similar to the NodePort, this YAML defines an incoming port 80, and anything that comes in there will route to the my-sasviya-app pods at target port 80. Probably the biggest challenge with LoadBalancers is that they can be costly to use in the cloud. And you'll need a LoadBalancer defined for each and every service you want to expose. Fortunately, for SAS Viya we don't need to expose many services to clients outside of the k8s infrastructure, but it's still a consideration to weigh.

 

 

Ingress

An Ingress is not a type of k8s service, but we do use it to help route traffic from external clients to SAS Viya services inside the k8s cluster. For SAS Viya 2020.1, we currently only support the NGINX Ingress Controller, but the long-term objective is for SAS to accommodate any Ingress Controller technology. Ingresses only work with HTTP and HTTPS traffic - and from the OSI Network Model perspective, they operate at the Application Layer (Layer 7). As opposed to LoadBalancers, Ingresses are cognizant of the HTTP protocol and its payload. This means an Ingress can route traffic to different services based on the hostname and path in the URL… which is very cool.

 

rc_4-Ingress.png

One Ingress can act as the gateway to multiple services at once. And you get to define the rules that make it happen.

 

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-sasviya-app-ingress
spec:
  backend:
    serviceName: default-sasviya-app
    servicePort: 8080
  rules:
  - host: foo.sasviya.site.com
    http:
      paths:
      - backend:
          serviceName: sasviya-app-foo
          servicePort: 8080
  - host: sasviya.site.com
    http:
      paths:
      - path: /bar
        backend:
          serviceName: sasviya-app-bar
          servicePort: 8080

 

Ingresses are very powerful and flexible - but that does make them a little more complicated to define. Shown in this YAML is an Ingress with routes defined to 3 different apps. The route selected will be determined by the content of the HTTP URL:

 

  1. http://foo.sasviya.site.com ==> routes to ==> sasviya-app-foo service
  2. http://sasviya.site.com/bar ==> routes to ==> sasviya-app-bar service
  3. All other URL variations will route to the default-sasviya-app service.

In this way, a single Ingress can handle routing traffic to any number of SAS Viya services inside the k8s cluster. Depending on your infrastructure provider, the Ingress Controller will likely provision a single LoadBalancer (that you pay for), which means your SAS Viya services running across many pods on any number of nodes are reachable from a single IP address. And this barely scratches the surface of what Ingresses are capable of.
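
One thing to be aware of if you try this on a newer cluster: the extensions/v1beta1 Ingress API was removed in Kubernetes 1.22. Here's a sketch of the same three routes written against networking.k8s.io/v1 (available since Kubernetes 1.19). The ingressClassName value assumes an IngressClass named nginx exists in your cluster, which depends on how the NGINX Ingress Controller was installed.

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-sasviya-app-ingress
spec:
  ingressClassName: nginx            # assumption: IngressClass "nginx" exists
  defaultBackend:                    # replaces spec.backend
    service:
      name: default-sasviya-app
      port:
        number: 8080
  rules:
  - host: foo.sasviya.site.com
    http:
      paths:
      - path: /
        pathType: Prefix             # pathType is required in v1
        backend:
          service:
            name: sasviya-app-foo
            port:
              number: 8080
  - host: sasviya.site.com
    http:
      paths:
      - path: /bar
        pathType: Prefix
        backend:
          service:
            name: sasviya-app-bar
            port:
              number: 8080

The routing behavior is intended to be the same as before. The main differences are the nested service/port structure under each backend and the explicit pathType on every path.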

 

Furthermore, there are add-ons that work alongside the Ingress Controllers, too. One, in particular, is cert-manager, which SAS Viya relies on to automatically provision SSL/TLS certificates for its services. Be sure to check out Stuart Rogers' post, SAS Viya 2020.1 (and higher) TLS Secure by Default – provided by cert-manager to see the benefits.
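
To give a feel for how cert-manager hooks into an Ingress in general, here's a sketch of the generic pattern: an annotation names the issuer, and the tls section names the secret where the certificate should land. This is not necessarily how the SAS Viya deployment assets wire it up. The issuer name, secret name, and ingressClassName value are all hypothetical.

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-sasviya-app-ingress-tls   # hypothetical name
  annotations:
    # asks cert-manager to issue a certificate using this ClusterIssuer
    cert-manager.io/cluster-issuer: my-cluster-issuer   # hypothetical issuer
spec:
  ingressClassName: nginx            # assumption: IngressClass "nginx" exists
  tls:
  - hosts:
    - sasviya.site.com
    secretName: sasviya-site-tls     # hypothetical secret for the cert and key
  rules:
  - host: sasviya.site.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: default-sasviya-app
            port:
              number: 8080

With something like this in place, cert-manager watches the Ingress, requests a certificate for the listed host, and stores it in the named secret for the Ingress Controller to serve.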

 

SAS Viya endpoints

SAS Viya takes advantage of these various inbound routing technologies depending on the source of the request and its target. So, for example, when one SAS Viya microservice needs to communicate with another, it tries the most direct route with the network traffic staying inside the cluster and utilizing the target service's ClusterIP.

 

But what about end users whose PCs, tablets, and other devices are outside of the k8s cluster and the cloud infrastructure on which it's hosted? For web browser access to SAS Viya applications like SAS Visual Analytics, SAS Studio, SAS Environment Manager, and others, SAS Viya relies on Ingresses to capture the incoming HTTP traffic and deliver it to the desired service endpoint.

 

Things get a little different for non-HTTP clients. Consider programmatic clients of SAS Cloud Analytic Services (SAS 9.4, R, Python, Java, Lua, etc.) trying to reach the CAS binary API on the CAS Controller pod's port 5570. Here we expect to use a LoadBalancer service instead of an Ingress.
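
As a rough illustration of that idea, here's what a LoadBalancer service for the CAS binary port could look like. Treat it as a sketch only: the service name and the selector label are hypothetical, and the SAS Viya deployment assets have their own mechanism for enabling this, so follow the SAS documentation for a real deployment.

---
apiVersion: v1
kind: Service
metadata:
  name: my-cas-binary-lb            # hypothetical name
spec:
  selector:
    app: my-cas-controller          # hypothetical label on the CAS Controller pod
  type: LoadBalancer
  ports:
  - name: cas-binary
    port: 5570                      # port that external CAS clients connect to
    targetPort: 5570                # CAS binary API port on the controller pod
    protocol: TCP

Because a LoadBalancer works at Layer 4, it doesn't need to understand the CAS binary protocol. It just forwards the TCP connection through to the pod.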

 

One area we're still figuring out for SAS Viya 2020.1 is how the SAS In-Database Embedded Process (EP) communicates with CAS. When CAS wants data from the EP, it advertises its workers' hostnames and DC ports to the EP. Then the EP instances will attempt to initiate communication with all of the CAS workers to send over the requested data. So far we've found that NodePorts (free) can get the job done if you're running your own k8s cluster, but if you're running in the cloud using something like Azure Kubernetes Service, then we need a LoadBalancer (costly) for each CAS worker instead. SAS R&D is still refining this to get to the best solution.

 

What's next?

It's time to get your hands dirty. The only way to get comfortable with SAS Viya running in a Kubernetes cluster is to try working with it yourself. The deployment and configuration steps described in this post are challenging to understand without real-life experience. I can't give you a k8s cluster, but if you've already got one (or know how to set one up), then the SAS documentation helps by describing how to configure external access to SAS Viya services... see SAS® Viya® Operations > 2020.1 > Deployment > Common Customizations

 

--

 

Note: This article addresses features of SAS Viya 2020.1 (and later).

 

 
