This is Part 2 of the post on running SAS Viya on a shared Kubernetes cluster. In Part 1 I discussed some of the challenges that can be encountered when deploying SAS Viya in a cluster that has untainted nodes, as well as the main deployment considerations.
In this post I will discuss implementing required scheduling (required nodeAffinity) to force a desired topology when there are untainted nodes in the cluster.
Again, I will share some of the tests that I ran to help illustrate the issues.
To recap, for my testing I created an Azure Kubernetes Service (AKS) cluster with the required SAS Viya node pools, with the labels and taints applied, but they were scaled to zero. In addition to the four node pools that correspond to the standard SAS Viya workload classes (cas, compute, stateful, and stateless) and a ‘system’ node pool for the Kubernetes control plane, I also created an ‘apps’ node pool (without any taints applied).
This was to simulate a scenario where other applications are running in the cluster and several untainted nodes are available.
To ensure that the SAS Viya pods run on the desired nodes (when there are untainted nodes), ‘required node affinity’ must be used.
I will start by saying that the easiest path, the simplest configuration option when running with untainted nodes, is to focus on configuring just the CAS and Compute pods. This is easier than reconfiguring all the stateless and stateful services to use required scheduling. There is also a patch transformer for CAS supplied in the overlays: require-cas-label.yaml
See the SAS Viya README: Optional CAS Server Placement Configuration
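For reference, the require-cas-label.yaml overlay changes the CAS node affinity from preferred to required for the cas workload class. Conceptually it looks something like the following sketch; treat this as illustrative only and check the copy shipped in your sas-bases, as the exact content can vary by release.

# Sketch of a require-cas-label style PatchTransformer (illustrative only)
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: require-cas-label
patch: |-
  # Make the cas workload class label a required nodeAffinity rule
  - op: add
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - cas
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1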
This means that you only need to focus on the sas-programming-environment components (pods).
I discuss enabling required scheduling in the following post: Creating custom Viya topologies – Part 2 (using custom node pools for the compute pods).
For this test I configured required scheduling for both CAS and Compute, reset the cas and compute node pools to zero nodes, then deployed SAS Viya.
I created the following patch transformers for the sas-programming-environment pods:
To summarize the configuration, the following patch is applied to update the nodeAffinity for the Compute components:
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
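For completeness, here is a sketch of how that patch could be wrapped in a full PatchTransformer targeting the sas-programming-environment pod templates. The transformer name is my own, and I am assuming the sas.com/template-intent=sas-programming label selects those PodTemplates in your release; verify the label against your deployment before using it.

# Sketch: require the compute label for the sas-programming-environment pod templates
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: require-compute-label-programming-pods
patch: |-
  # Remove the preferred nodeAffinity and add a required rule for the compute label
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
target:
  kind: PodTemplate
  version: v1
  labelSelector: "sas.com/template-intent=sas-programming"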
While this worked, it did highlight a problem for the first user to log in to SAS Studio, as it is at this point that the first compute node is created. The SAS Studio session timed out while waiting for the container images to be downloaded and the Compute Server (pod) to start.
I waited for the compute node to fully start and then launched a new SAS Studio session. This time the Compute Server context started successfully.
This does beg the question: Should you ever let the compute node pool scale to zero?
Looking at the SAS Viya deployment I still didn’t have any pods running on the stateful or stateless nodes. These node pools were still scaled to zero, all the stateless and stateful pods were running on the ‘apps’ nodes.
But would starting some stateless and stateful nodes prior to the SAS Viya deployment fix this?
At this point I did another test; I manually scaled the stateful node pool to have one node. Before the SAS Viya deployment the AKS cluster had the following nodes.
You can see that I had three nodes available in the SAS Viya node pools and again I had several ‘apps’ nodes (aks-apps-xxxx-vmssnnnn) available to simulate nodes being used for other applications.
I did this to illustrate the Kubernetes pod scheduling behaviour. The cluster autoscaler will not scale a node pool until it is needed. Regardless of the pod's node affinity, if an “acceptable” node is already available, in this case one without any taints, the pod will be scheduled there as a first choice.
Hence, at the end of this deployment I still only had one stateful node and no stateless nodes.
Once SAS Viya had deployed, I used the following command to view the pods running on the stateful node (aks-stateful-12471038-vmss000000) and found there were only 8 pods running on the node.
kubectl -n <namespace> get pods --field-selector spec.nodeName=<node-name>
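For example, for the stateful node above and an assumed namespace of viya (substitute your own deployment namespace):

kubectl -n viya get pods --field-selector spec.nodeName=aks-stateful-12471038-vmss000000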
The rest of the Viya stateful and stateless pods were running on the 'apps' nodes.
In this scenario, perhaps the organisation wants to dedicate, or reserve, only one node pool for SAS Viya. When sharing a cluster with other applications, this is probably the simplest approach to ensure that nodes are available to meet the requirements of the SAS Viya Compute Server and CAS functions.
I like this configuration because you only need to create one additional node pool in an existing cluster. Combined with required scheduling for the CAS and Compute pods, it also allows the node pool to scale to zero.
The problem described earlier with scaling the node pools to zero is avoided, because starting the CAS server triggers the initial scale-up of the shared node pool.
For this test I created a ‘viya’ node pool, with the compute label and taint applied.
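As a sketch, creating such a node pool with the Azure CLI could look like the following; the resource group, cluster name, VM size, and node counts are illustrative values, so adjust them to your own sizing.

# Illustrative only: add a 'viya' node pool with the compute label and taint applied
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name viya \
  --node-vm-size Standard_E16ds_v5 \
  --node-count 1 \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 5 \
  --labels workload.sas.com/class=compute \
  --node-taints workload.sas.com/class=compute:NoSchedule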
To avoid pod drift, the sas-programming-environment pods should be updated to use required scheduling using the standard compute labels and taints, as described above.
Additionally, as the CAS and Compute pods are sharing the same node pool, the CAS server configuration had to be updated to target the shared node pool, the ‘viya’ node pool. For this I used the provided CAS overlay as a template and targeted the workload.sas.com/class=compute label. The tolerations also had to be updated for the compute taint.
The patch transformers for the CAS configuration are shown below. The first patch transformer updates the CASDeployment.
# PatchTransformer to make the compute label required and provide a toleration for the compute taint
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: run-cas-on-compute-nodes
patch: |-
  # Remove existing nodeAffinity
  - op: remove
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
  # Add new nodeAffinity
  - op: add
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
  # Set tolerations
  - op: replace
    path: /spec/controllerTemplate/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1
The second patch transformer updates the 'sas-cas-pod-template'.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-cas-pod-template-tolerations
patch: |-
  - op: replace
    path: /template/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  kind: PodTemplate
  version: v1
  name: sas-cas-pod-template
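Assuming the two patch transformers above are saved under site-config (the file names below are just my illustration), they are then referenced in the transformers block of the kustomization.yaml, for example:

# Fragment of kustomization.yaml (illustrative file names)
transformers:
  - site-config/run-cas-on-compute-nodes.yaml
  - site-config/set-cas-pod-template-tolerations.yaml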
The final update I made was to adjust the CPU and memory requests and limits for CAS. This was to ensure that there was space available on the shared nodes for the Compute Server pods.
For example, I used nodes with 16 vCPU and 128 GB of memory, and set the CAS limits to 12 vCPU and 96 GB of memory. I also set the requests equal to the limits to enforce guaranteed QoS. Using guaranteed QoS is important to protect the CAS server pods when the nodes are busy, as it ensures that the CAS pods are among the last to be evicted from a node.
This was my target topology.
To configure the CPU and memory for CAS see the example in sas-bases:
../sas-bases/examples/cas/configure/cas-manage-cpu-and-memory.yaml
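Based on that example, a sketch of the settings I described above (12 vCPU and 96Gi, with requests equal to limits) could look something like the following. The exact patch paths and format in the shipped example may differ by release, and I am assuming the cas container is the first container in the controllerTemplate, so use the sas-bases example as the source of truth.

# Sketch: set CAS requests equal to limits for guaranteed QoS (illustrative values)
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-manage-cpu-and-memory
patch: |-
  # Assumes the cas container is at index 0 in the controllerTemplate
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits
    value:
      cpu: "12"
      memory: 96Gi
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/requests
    value:
      cpu: "12"
      memory: 96Gi
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1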
When I deployed SAS Viya with an MPP CAS server, this had the added benefit of ensuring that multiple nodes were available for the Compute Server pods. The Compute pods were configured with the resource defaults (CPU and memory requests and limits).
As a side note (running SAS Viya 2024.03), if you inspect the sas-compute-server pod you will see two running containers. The sas-programming-environment container has resource limits of 2 cpu and 2Gi memory, and the sas-process-exporter container has limits of 2 cpu and 4Gi memory.
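If you want to check this yourself, one way is to list the container names and limits of a running compute server pod (the namespace and pod name are placeholders):

kubectl -n <namespace> get pod <sas-compute-server-pod> \
  -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits}{"\n"}{end}'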
Once SAS Viya was running, I then started multiple SAS Studio sessions. In the image below you can see that the Compute Server pods are running on multiple ‘viya’ nodes.
As an end user life was good, as my SAS Studio sessions started without any failures. 😊
Sharing the cluster with other applications is possible, but this needs to be carefully planned to ensure the best result is achieved for ALL applications.
As always, it’s important to focus on the requirements for the SAS Viya platform. When the cluster has untainted nodes, you should configure required scheduling to ensure you have an operational SAS Viya platform. The simplest approach is probably just to focus on Compute and CAS.
A key question is: Do the untainted nodes meet the system requirements for SAS Viya?
If not, additional node pools WILL be required to run the SAS Viya platform.
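A quick way to check what the existing nodes offer is to look at their workload class labels, taints, and capacity, for example:

# Show nodes with their SAS workload class label (blank means the node carries no SAS workload class label)
kubectl get nodes -L workload.sas.com/class
# Inspect the capacity, labels, and taints of a specific node
kubectl describe node <node-name>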
Here I have proposed the concept of sharing a node pool for Compute and CAS, and shown how you could reserve some capacity for the two workloads. You should do some capacity planning and sizing to establish suitable node sizes and the appropriate resource reservations.
But keep in mind, it is possible to have a shared node pool for CAS and Compute and still use CAS auto-resources to effectively dedicate some nodes within the node pool to running CAS.
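If you go that route, the CAS auto-resources overlay is enabled through the kustomization.yaml; in the releases I have worked with it is referenced roughly as shown below, but check the README in your sas-bases for the exact entries for your release.

# Fragment of kustomization.yaml (verify against sas-bases/overlays/cas-server/auto-resources for your release)
resources:
  - sas-bases/overlays/cas-server/auto-resources
transformers:
  - sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml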
Finally, I have left you with a couple of questions to ponder (from an end-user perspective), for example: Should you ever let the compute node pool scale to zero?
I’m sure this will lead to interesting discussions! Maybe it’s a topic for another day…