This is another post in my SAS Viya Topologies series. This time we will look at using a two-node pool configuration and how to achieve the desired topology. We will examine sharing a node pool for Compute and CAS processing. I tested this using both an SMP CAS Server and an MPP CAS Server.
For this testing I was working in the Microsoft Azure Cloud and using the SAS Infrastructure as Code GitHub project to build the Kubernetes cluster.
Let’s have a look at the details and how to get the desired topology.
As the title suggests, the desired topology was to use two node pools for my SAS Viya deployment. That is, a node pool for the microservices and a single node pool for the Compute and CAS pods. My goal was to use “small” commodity VMs for all services other than the Compute and CAS engines (pods). This is shown in the image below.
My objective was to simplify the deployment topology by using a single node type, and a single node pool, for the “compute-tier”.
The rationale for using a single node pool for the Compute and CAS processing is that they have similar node requirements: nodes with ample CPU and memory and local ephemeral disk. It is also a recognition that you can share the nodes by letting Kubernetes control the pod scheduling.
A key deployment (or architectural) decision is how to implement the single node pool for the compute-tier.
To minimise the “custom configuration” I wanted to make use of the standard SAS Viya workload class labels and taints where possible. So, should you configure the Compute pods to run on the CAS nodes, or is it better to configure CAS to run on the compute nodes?
I have previously written about Creating custom Viya topologies – Part 2 (using custom node pools for the compute pods). In that blog I discussed the configuration that needs to be updated when moving the compute pods to a custom node pool.
Two additional considerations that I would like to highlight when moving the compute pods are, firstly, the prepull function for the SAS programming environment container image and, secondly, whether SAS Workload Management (WLM) will be enabled. WLM requires at least one node to have the “compute” workload class label.
With the above in mind, and based on my testing, the best and/or simplest approach is to configure the CAS pods to run on the “compute” nodes. That is, nodes that have the ‘workload.sas.com/class=compute’ label and taint applied.
As I said in the introduction, I was working in the Azure Cloud and used the SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure GitHub project to build the AKS cluster. The image below shows the node pool definitions that I used for my testing.
Here you can see the compute node pool definition is using the Standard_E8ds_v4 instance type, which provides 8 vCPUs, 64GiB of memory and 300GB of SSD temp storage. This will be used for the Compute and CAS pods.
The nodes have the standard labels and taint applied for the compute nodes.
The second node pool is called generic and does not have any labels or taints applied. It uses the Standard_D4s_v4 instance type, which provides 4 vCPUs with 16GiB of memory; it was my “commodity” VM instance. Note that this node pool has “max_nodes” set to 20. That many nodes aren’t needed for SAS Viya to run; in fact, using this instance type the deployment spun up 6 nodes.
Tip! When using the IaC and defining nodes without any label or taint, you still must specify the “node_labels” and “node_taints” parameters, with null values, as shown above.
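To illustrate the point, here is a sketch of what the two node pool definitions might look like in the IaC terraform.tfvars file. This is illustrative rather than a copy of my exact file: the node_pools map format and the prepull label on the compute pool follow the viya4-iac-azure examples, and the disk size and minimum node counts are placeholders, so check the project documentation for the current syntax.

node_pools = {
  # Compute/CAS node pool - standard compute workload class label and taint
  compute = {
    "machine_type" = "Standard_E8ds_v4"
    "os_disk_size" = 200
    "min_nodes"    = 1
    "max_nodes"    = 5
    "node_taints"  = ["workload.sas.com/class=compute:NoSchedule"]
    "node_labels" = {
      "workload.sas.com/class"        = "compute"
      "launcher.sas.com/prepullImage" = "sas-programming-environment"
    }
  },
  # Generic node pool for everything else - note the empty labels and taints
  generic = {
    "machine_type" = "Standard_D4s_v4"
    "os_disk_size" = 200
    "min_nodes"    = 1
    "max_nodes"    = 20
    "node_taints"  = []
    "node_labels"  = {}
  }
}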
The advantage of using the standard compute node configuration is that you only need to focus on the CAS configuration. Let’s look at what is required.
The core of the configuration is that CAS pods need to target the compute nodes and must have a toleration for the compute (workload.sas.com/class=compute) taint.
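As a reminder of what CAS has to match, the standard compute nodes carry the following label and taint. This is an abridged, illustrative view of the Kubernetes Node object, not output from my cluster.

apiVersion: v1
kind: Node
metadata:
  labels:
    workload.sas.com/class: compute    # the label the CAS pods must target
spec:
  taints:
  - key: workload.sas.com/class        # the taint the CAS pods must tolerate
    value: compute
    effect: NoSchedule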
For this I used the require-cas-label.yaml as a template to configure required scheduling to use the compute label. The ‘require-cas-label.yaml’ can be found in the ../sas-bases/overlays/cas-server folder.
I also used this to set the tolerations for the CASDeployment. The following is the configuration that I used.
Line 10 is highlighted and shows the definition for the required scheduling. The example transformer in sas-bases only has the first ‘- op: add’ statement, which provides the configuration to target CAS nodes; I updated this to target the Compute nodes.
In addition to this update, on lines 16 – 36 you can see the update that I added to replace the tolerations. This configuration also illustrates a change that was introduced at Stable 2023.05 (May 2023): the addition of two new workload classes for CAS.
For this, lines 23 – 29 set the tolerations for the controllerTemplateAdditions and lines 30 – 36 set the tolerations for the workerTemplateAdditions. As you can see, the tolerations should now be set in three definitions, not just on the controllerTemplate definition.
Here is the template should you need to copy and paste it.
# PatchTransformer to make the compute label required
# in addition to the azure system label
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: require-compute-label
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
  - op: replace
    path: /spec/controllerTemplate/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
  - op: replace
    path: /spec/controllerTemplateAdditions/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
  - op: replace
    path: /spec/workerTemplateAdditions/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1
In addition to the PatchTransformer above, you also need to set the tolerations for the sas-cas-pod-template. This is done using the following configuration (set-cas-pod-template-tolerations.yaml).
# Patch to update the sas-cas-pod-template pod configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-cas-pod-template-tolerations
patch: |-
  - op: replace
    path: /template/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  kind: PodTemplate
  version: v1
  name: sas-cas-pod-template
The two PatchTransformers shown above form the core of the configuration to use the Compute nodes for the CAS pods.
An additional consideration is whether to use the CAS auto-resources configuration. I don’t recommend doing this, for a couple of reasons. Firstly, and most importantly, in my testing enabling the CAS auto-resourcing stopped the Compute prepull function from operating.
Secondly, the auto-resourcing is intended to dedicate nodes to the CAS pods, and this configuration is looking to share the nodes (between CAS and SAS programming workloads), so it doesn’t make sense to implement the auto-resourcing. See the Deployment Guide: Adjust RAM and CPU Resources for CAS Servers.
However, there is a final configuration that I would recommend. You should set the resource requests and limits for the CAS pods and implement Guaranteed Quality of Service (QoS).
Implementing Guaranteed QoS provides additional protection for the CAS pods and means they will not be the ones killed when a node runs out of memory and out-of-memory (OOM) processing kicks in. Instead, the Compute pods will be evicted from the nodes should an out-of-memory situation occur. It should be noted that the sas-compute pods are transient, so this is a normal configuration, not an unintended consequence of using the two-node pool topology.
If a Compute pod gets evicted it just affects one user, while if a CAS pod is evicted it will have an impact on all CAS users (depending on the CAS Server configuration and how the data has been loaded).
To set the CAS pod requests and limits you can use the cas-manage-cpu-and-memory.yaml example in the ../sas-bases/examples/cas/configure folder.
To implement the Guaranteed QoS you set the requests and limits to the same values. For my environment I was using the Standard_E8ds_v4 instance type, which provides 8 vCPUs with 64GiB of memory. For my testing I set the memory requests and limits to 48GiB and the CPU requests and limits to 6. This is shown in the example below.
# This block of code is for adding resource requests and resource limits for
# memory and CPU.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-manage-cpu-and-memory
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits
    value:
      memory: 48Gi
  - op: replace
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/memory
    value:
      48Gi
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits/cpu
    value:
      6
  - op: replace
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/cpu
    value:
      6
target:
  group: viya.sas.com
  kind: CASDeployment
  # Uncomment this to apply to all CAS servers:
  name: .*
  # Uncomment this to apply to one particular named CAS server:
  #name: {{ NAME-OF-SERVER }}
  # Uncomment this to apply to the default CAS server:
  #labelSelector: "sas.com/cas-server-default"
  version: v1alpha1
Using this configuration will leave 2 vCPUs and 16GiB of memory for other pods. By default, each compute session will request 50 millicores and 300MB of memory.
Finally, the kustomization.yaml needs the following updates to implement the configuration. For my environment I used a ‘cas’ folder under ‘/site-config’ to hold the configuration. The configuration needs to be added to the transformers section. For example:
transformers:
:
- site-config/cas/require-compute-label.yaml
- site-config/cas/set-cas-pod-template-tolerations.yaml
- site-config/cas/cas-manage-cpu-and-memory.yaml
I tested using an SMP CAS Server and an MPP CAS Server. One of the nice things about the MPP CAS Server deployment was that there were now multiple compute nodes available for the Compute pods. For example.
Here you can see my MPP CAS deployment, a Controller with 4 Workers, all running in the compute node pool (the compute nodes). Each is on a different compute node due to the CPU and memory resource reservations and pod anti-affinity settings.
To further test the configuration, I started two SAS Studio sessions; you can see that one sas-compute pod started on compute node vmss000001 and the other session started on node vmss000003. This is highlighted in the yellow box.
In this second example, I deployed an SMP CAS Server. The first SAS Studio session is using the same node as the CAS Server (vmss00000g). I then manually scaled the Compute node pool to have two nodes. Once the second node was ready, I started a second SAS Studio session; you can see that it is using the vmss00000h node.
Here you can see that the sas-cas-server pods have the requests and limits set as configured.
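You can also confirm the settings from the pod specification itself. The cas container in the controller pod should show matching requests and limits, something like the abridged excerpt below; the pod and container names here assume the default CAS server, so adjust them for your deployment.

# Abridged output, e.g. from: kubectl -n <viya-namespace> get pod sas-cas-server-default-controller -o yaml
containers:
- name: cas
  resources:
    limits:
      cpu: "6"
      memory: 48Gi
    requests:
      cpu: "6"
      memory: 48Gi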
Hopefully this demonstrates that it is a relatively simple process to configure the SAS Viya deployment to share a node pool for the Compute and CAS pods.
I see this type of configuration mainly being used for Visual Analytics deployments supporting a small number of programmers. For large environments supporting many programmers and/or heavy CAS processing, or for environments looking to further optimise the deployment, dedicated Compute and CAS node pools would still be used.
It should be noted that configuring the node pools using the IaC is a relatively trivial process, so a valid question is whether the added configuration complexity is worth the effort. I will let you decide that. But if your customer wants to limit the number of node pools, it is possible.
Finally, to recap, for a scenario where a shared node pool is desired for CAS and Compute:
- Use the standard compute node pool (the workload.sas.com/class=compute label and taint) for both the Compute and CAS pods.
- Configure the CAS pods to require the compute node label and to tolerate the compute taint, for the CASDeployment (controllerTemplate, controllerTemplateAdditions and workerTemplateAdditions) and for the sas-cas-pod-template.
- Do not implement the CAS auto-resources configuration.
- Set the CAS CPU and memory requests and limits to the same values to implement Guaranteed QoS.
- Add the transformers to the kustomization.yaml.
Find more articles from SAS Global Enablement and Learning here.