For SAS Viya 3.5 I wrote a series of posts on real-time integration and implementing High Availability (HA) for the SAS Micro Analytic Service (MAS). Now that SAS Viya is running on Kubernetes, I thought it was about time to revisit this topic.
Running on Kubernetes presents some new options when it comes to high availability. But as always, it is important to understand the requirements and the drivers for the system availability targets.
I say “system”, as when SAS Viya is supporting real-time business processing it is just one part of the business transaction. There may be upstream and downstream processing.
Let’s look at some of the considerations…
SAS Viya availability targets can’t be viewed in isolation. The Kubernetes cluster also needs to be capable of meeting the targets. Hence, it is important to understand the Kubernetes infrastructure and how resilient it is.
On this point, the capabilities available will vary depending on whether the Kubernetes platform is running on-premises or in the Cloud. The Cloud Providers typically offer features such as Availability Zones, which can be used to protect resources in the event of data centre outages.
If a single Kubernetes cluster can’t meet the availability requirements, then multiple clusters, or the use of multiple availability zones, might be required. That is, the Recovery Time Objective (RTO) may drive the need for multiple Kubernetes clusters, and maybe even geographic/datacentre separation.
This brings us back to using what I called a “Shared nothing” deployment. The shared nothing architecture pattern involves implementing multiple standalone SAS Viya platforms, with a load balancer frontend. If your SAS Viya platform is deployed to run on-premises instead of in the cloud, then the approach outlined in that post is still an option to consider.
Let’s assume that the SAS Viya platform is dedicated to running MAS and supporting the real-time transactions. You could use the HA patch transformer to implement HA (see here), but this could be overkill as it provides 2 pods for each stateless service.
One of the nice things about running on Kubernetes is that it is possible to focus on individual services. So, it is possible to take a targeted approach to implementing HA. That is, just focus on MAS and the core Viya services that support MAS.
The stateful services are deployed with redundancy by default, so you just need to focus on the following services:
All the above services are controlled using a HorizontalPodAutoscaler (HPA). To get a complete list of the HPAs in the SAS Viya deployment you can use the following command.
kubectl -n namespace get hpa
You can create a set of patch transformers targeting these components to update their HPA definitions. As part of creating the patch transformers, you need to determine the number of replicas required. I would suggest two or three; for HA purposes, I can’t really see the need to go beyond three replicas.
The following example patch transformer covers just the SAS Logon and Identities services and sets the replicas to two. As the update addresses HA rather than workload scalability, you can see that I have set the minReplicas and maxReplicas values both to two replicas.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: sas-logon-ha
patch: |-
  - op: replace
    path: /spec/maxReplicas
    value: 2
  - op: replace
    path: /spec/minReplicas
    value: 2
target:
  kind: HorizontalPodAutoscaler
  version: v2
  group: autoscaling
  name: sas-logon-app
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: sas-identities-ha
patch: |-
  - op: replace
    path: /spec/maxReplicas
    value: 2
  - op: replace
    path: /spec/minReplicas
    value: 2
target:
  kind: HorizontalPodAutoscaler
  version: v2
  group: autoscaling
  name: sas-identities
This shows the pattern to update the required services.
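To apply the patch transformers, they need to be referenced from the deployment's kustomization.yaml. The following is a minimal sketch; the file names and site-config location are assumptions for this example, not the exact paths from my deployment.

```yaml
# kustomization.yaml fragment: reference the HA patch transformers.
# The file names under site-config/ are assumed for illustration.
transformers:
  - site-config/sas-logon-ha.yaml
  - site-config/sas-identities-ha.yaml
```

After updating the kustomization.yaml, rebuild the manifest with kustomize and apply it in the usual way.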
There are several things to consider when it comes to the MAS configuration, including:
Typically, with real-time integration, latency is critical. When the SAS Viya platform is integrated into a business process, it is key that the business transactions are not impacted by the SAS platform.
So, while it might not take long for Kubernetes to start a new MAS pod (in human terms), this can still have an impact on the transactions. It is important to note that the MAS pod startup time will be governed by the number of published models. Each MAS pod is running all the published models. Therefore, having multiple MAS pod replicas is important.
The MAS deployment is also controlled by a HorizontalPodAutoscaler definition. Therefore, the best approach to implement HA is to define an HPA patch. The following is an example to configure three replicas.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: enable-mas-ha
patch: |-
  - op: replace
    path: /spec/maxReplicas
    value: 3
  - op: replace
    path: /spec/minReplicas
    value: 3
target:
  kind: HorizontalPodAutoscaler
  version: v2
  group: autoscaling
  name: sas-microanalytic-score
Another consideration is the pod anti-affinity (podAntiAffinity) definition. This is set by default, using preferred scheduling. Therefore, assuming there are available nodes, Kubernetes should try and separate the running MAS pods. I say “should” as it is a “preferred” (not “required”) scheduling definition.
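For context, a "preferred" (soft) pod anti-affinity rule in Kubernetes generally takes the following shape. This is an illustrative sketch of the pattern, not the exact SAS manifest; the label selector shown is an assumption for the example.

```yaml
# Illustrative preferred podAntiAffinity: the scheduler tries to place
# matching pods on different nodes, but will co-locate them if it must.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: sas-microanalytic-score   # assumed label, for illustration
          topologyKey: kubernetes.io/hostname
```

A "required" (requiredDuringSchedulingIgnoredDuringExecution) rule would instead leave a pod Pending when no separate node is available, which is why the softer preferred form is the default trade-off here.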
The same is true for the other stateless services discussed above.
The second consideration, dedicating nodes to running the MAS pods, is definitely a way to isolate MAS from other workloads. But this carries an overhead and may not be the most efficient use of resources, especially if you need two or more nodes to provide availability.
The final consideration to discuss is whether to implement guaranteed Quality of Service (QoS).
I have covered this in the post: Using Guaranteed QoS with SAS Viya
To summarise that post: Guaranteed QoS provides the highest level of service. These pods will only be scheduled to Kubernetes nodes with sufficient resources, and they are the last to be evicted when a Kubernetes node is under resource pressure.
Therefore, this can be a good approach when the MAS pods are running on a node with other workloads. It provides a level of protection for the MAS pod(s), providing a good alternative to dedicating nodes.
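As a reminder, Kubernetes assigns a pod the Guaranteed QoS class when every container in the pod has resource requests equal to its limits, for both CPU and memory. A minimal sketch (the values shown are illustrative, not SAS-recommended sizes):

```yaml
# Requests equal to limits in every container => Guaranteed QoS class.
# The CPU/memory values below are placeholders for illustration only.
resources:
  requests:
    cpu: "2"
    memory: 4Gi
  limits:
    cpu: "2"
    memory: 4Gi
```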
After updating the kustomization.yaml with the patch transformers and applying the changes, the following example is from my test deployment running in Azure.
Here you can see that I now have two replicas for the sas-identities and sas-logon-app pods. The sas-identities pods are running on two different stateless nodes (vmss000001 and vmss000002). The sas-logon-app pods are running on a stateful node and a stateless node.
Finally, you can see the three MAS (sas-microanalytic-score) pods are all on different nodes.
Here you can see that it is a relatively simple process to create a SAS Viya deployment with a targeted HA implementation. This helps to minimize the deployment footprint.
When running on one of the Cloud Providers' Kubernetes platforms and using a targeted HA deployment as described here, the availability requirements could possibly be met without the need to implement a “Shared nothing” architecture.
Assuming the models or decision flows are capable of being published as SAS Container Runtime images (see the SAS Container Runtime documentation: Model Score Code and Decision Object Support and Limitations), this makes the model or decision portable and removes the requirement to have a SAS Viya platform at runtime. This also opens up other options for HA. Perhaps that’s a topic for another day.
Thanks for reading and I hope this is helpful…
Michael Goddard
Find more articles from SAS Global Enablement and Learning here.