
CAS Resources Management: options and recent changes – part 1


CAS (Cloud Analytic Services) is a critical component of the SAS Viya platform. It is usually the primary analytics engine that SAS users rely on to run their models and crunch large volumes of data to obtain advanced analytical results.

 

CAS also underpins many of our products with reporting capabilities, such as Visual Analytics, Model Studio, and others. In such cases, many user sessions have active CAS sessions, and the last thing we want is an interruption of our Cloud Analytic Services (which could happen if a CAS pod is terminated) that loses all the user sessions.

 

It is important to make sure that CAS is properly configured from a Kubernetes perspective, so that it can use as many resources as it needs and is protected against a potential Kubernetes eviction.

 

In this first part of the blog, we start with a reminder of how pod resources are defined and managed in Kubernetes and what the pod "QoS" classes are, and then we focus on how CAS is configured in this regard.

 

Kubernetes resource requests and limits

 

As explained in the Kubernetes documentation, you can define, in the pod’s specification, how much CPU and memory each container needs (the “request”). The kube-scheduler uses this information to decide which node to place the pod on.

 

You can also specify a resource limit for a container, and the kubelet (the Kubernetes agent running on each node) enforces those limits so that the running container is not allowed to use more of that resource than the limit you set.
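 

For illustration, here is a minimal, hypothetical pod specification that sets a request and a limit for both CPU and memory on a single container (the pod and container names are made up for the example):

apiVersion: v1
kind: Pod
metadata:
  name: resource-demo              # hypothetical pod name
spec:
  containers:
    - name: demo-container         # hypothetical container name
      image: busybox
      command: ["sleep", "3600"]
      resources:
        requests:
          cpu: 50m                 # 50 millicores reserved for scheduling
          memory: 50Mi
        limits:
          cpu: 500m                # the container is throttled above 0.5 core
          memory: 500Mi            # exceeding this can get the container terminated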

 

The kubelet also reserves at least the request amount of that system resource specifically for that container to use. This means that if the request is really high (close to the node’s maximum capacity), the effective result might be an entire node reserved for just one container (in its pod).

 

Let’s see an example!

 

We can run the following kubectl command (in the screenshot, the alias "k" is used) to extract and display the resource definitions for all the containers of a given pod.

 

[Image rp_1_sas-audit-resources.png: resource requests and limits of the sas-audit container]


 

We can see that for this pod, the CPU and memory requests are respectively 50 millicores (0.05 core) and 50Mi (mebibytes, around 52 MB) for the sas-audit container (there is only one container in this pod).

 

Actually, most of the SAS Viya microservices have these resource settings by default.

 

It is not a huge resource request, so the pod should be easily scheduled by Kubernetes, unless all the available node resources are already "reserved".
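 

To see how much of a node's capacity is already requested (and therefore unavailable for new pods), you can describe the node and look at its "Allocated resources" section, for example:

kubectl describe node <NODENAME> | grep -A 10 "Allocated resources"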

 

Now, the limits are set respectively to 500 millicores (0.5 core) and 500Mi of memory. So, once the pod has been scheduled and is running on a node, the kubelet starts to monitor its resource consumption. If the container uses more CPU than the limit, it is throttled, and if it exceeds 500Mi of memory utilization, the container is in danger of being terminated.
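 

A quick way to check whether a container was indeed terminated for exceeding its memory limit is to look at its last terminated state (here for the first container of the pod; adjust the namespace, pod name and container index):

kubectl -n <NAMESPACE> get pod <PODNAME> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# prints "OOMKilled" if the container was killed for exceeding its memory limit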

 

Note: Since the command to show the resource specifications of a pod is pretty handy, I have reproduced it below in a form that you can copy and paste, for your own benefit 😊 (just adjust the namespace and pod names).

 

 

 

kubectl -n <NAMESPACE> get pod <PODNAME> -o json | jq ".spec.containers[].name, .spec.containers[].resources"

# extra tip: to get only the information for a given container, simply use an index
# example, to only see the "sas-cas-server" container resource definitions:

kubectl -n <NAMESPACE> get pod sas-cas-server-default-controller -o json | jq ".spec.containers[0].name, .spec.containers[0].resources"

 

 

 

Pod Quality of Service (QoS)

 

Now, before looking specifically at CAS, there is another important Kubernetes concept to understand: Kubernetes assigns pods different Quality of Service (QoS) classes depending on what resources they request and what limits are set for them.

 

Kubernetes uses the pod’s QoS class to make decisions about evicting pods when a node's resources are exceeded. So, when several pods have been scheduled on a node that is starting to run out of resources (because the pods are consuming more and more), Kubernetes determines which pods to evict first based on their QoS class (in order to protect the node).

 

The QoS class of a pod is determined only by the resource request and limit values set for its containers.
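 

Kubernetes records the resulting class in the pod's status, so you can display it directly, for example:

kubectl -n <NAMESPACE> get pod <PODNAME> -o jsonpath='{.status.qosClass}'
# returns BestEffort, Burstable or Guaranteed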

 

The table below summarizes the configuration and Kubernetes management of the pods for the 3 QoS classes: "best effort", "burstable" and "guaranteed".

 

[Image rp_2_QoS-table.png: summary table of the three QoS classes and how Kubernetes manages them]

 

CAS Resource settings

 

Now that we understand how Kubernetes pods are configured in terms of resource requests and limits, we can look at how the CAS pod resources are configured.

 

Let’s review the 3 possible options* for the CAS pod resource settings:

  • CAS auto-resourcing (default)
  • Customized values
  • Initial values

 

(*) To see how to implement each configuration option, you can refer to the SAS official documentation and to the corresponding module of the SAS Viya 4 Architecture VLE.

 

With the first option, "CAS auto-resourcing", we let the CAS Deployment Operator discover the environment (and more specifically the nodes with the "CAS" label) and determine the appropriate resource request and limit values for the CAS pods.
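 

For example, assuming the CAS nodes carry the usual workload class label (adjust the label selector if your deployment uses a different one), you can list them and see the allocatable capacity the operator can work with:

kubectl get nodes -l workload.sas.com/class=cas -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory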

 

With CAS resources "Customized values", we disable the CAS auto-resources transformer references in the kustomization.yaml file and use a specific transformer to set the CAS pod resource requests and limits to values of our choice. The CAS pods' "QoS" class then depends on the values that are set with this option; see the sketch below.
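 

As a rough, illustrative sketch of what such a transformer can look like (the actual example file to copy and adapt is provided with the sas-bases deployment assets; the values and patch paths below are assumptions for illustration only):

# cas-manage-cpu-and-memory.yaml (illustrative sketch, not the official example)
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-manage-cpu-and-memory
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/cpu
    value: 4
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/memory
    value: 32Gi
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits/cpu
    value: 4
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits/memory
    value: 32Gi
target:
  kind: CASDeployment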

 

With the last option, we let the CAS pods start up with the settings defined in the initial resource definitions (that’s what you get when you disable CAS auto-resourcing and don’t use a transformer to set your own values). The default request values are 250 millicores (1/4 of a core) for the CPU and 2Gi for the memory, and there is no limit defined, which places the CAS pods in the “burstable” QoS class.


Note that this "Initial values" option is purely for a PoC / Dev / Test style deployment. For production workloads, you’ll want to set appropriate values with one of the first two options.

 

 

CAS auto-resourcing is the recommended choice and is enabled if you apply the initial kustomization.yaml file provided in the official documentation.
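 

In that initial kustomization.yaml, auto-resourcing is driven by a couple of sas-bases references that typically look like the following (check your own deployment assets and documentation for the exact paths):

resources:
  - sas-bases/overlays/cas-server/auto-resources                          # enables CAS auto-resourcing
transformers:
  - sas-bases/overlays/cas-server/auto-resources/remove-resources.yaml    # removes the initial CAS resource values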

 

Until recently, with CAS auto-resourcing, the values were set in a way that all containers would have the same values for requests and limits, which provides the Guaranteed "QoS" to the CAS pods (the highest quality of service for a pod).

 

However, there have been some changes to this strategy, and that is the topic of the upcoming second part of this post. 🙂

 

Find more articles from SAS Global Enablement and Learning here.
