This post is a follow-up to a previous blog post in which we explained how recent changes in Control Group v2 management in Kubernetes have impacted the stability of the CAS server in the SAS Viya platform.
In that post, we discussed some workarounds, but the latest versions of SAS Viya now include a new feature in the SAS software that avoids this issue in most cases.
This new feature is called "Backing Store for CAS Memory Allocations", and this post explains how it works and how it can be set up to mitigate the impact of CAS OOM (Out of Memory) issues in the Viya environment.
This post contains a lot of details and is fairly technical, so if you don't have time to read the whole article, here is a little “TL;DR” summary 😊
A new CAS feature was introduced in SAS Viya 2024.09 to limit CAS memory allocation. If it is enabled, when a CAS action uses more memory than a defined limit, the action is interrupted with a failure and a meaningful error message, but the CAS session remains active and the Linux OOM killer is not triggered (which avoids the situation where the OOM killer performs a “group” kill of every process in the CAS container, leading to a CAS service interruption). It greatly improves the stability and reliability of the CAS server service.
The technical mechanism used behind the scenes to limit CAS memory allocation (before the OOM killer kicks in) is to back the memory allocations onto a special Kubernetes volume (an emptyDir of “Memory” medium, i.e. a RAM-backed filesystem) for which we can set a SizeLimit value. When this value (usually set to 80% of the CAS container's memory limit) is exceeded, the CAS action is interrupted but the CAS session and services remain available for the end users.
The "Backing Store for CAS Memory Allocations" feature was first introduced with SAS Viya stable 2024.09 (which means that it is also included in the latest SAS Viya LTS 2024.09) and instructions where provided to patch or edit the CASDeployment Custom resource to enable it.
Then with the new stable 2024.11 version, new PatchTransformers were added, so it is now possible to manually enable the backing store for CAS memory as part of the initial deployment of SAS Viya (with the proper configuration in the kustomization.yaml file).
The current plan is to make the backing store the default in upcoming versions (in the first months of 2025).
We know that the CAS_DISK_CACHE can be used to cache CAS tables on disk (as SASHDAT files) using memory-mapped files. CAS can leverage the CAS_DISK_CACHE to quickly and efficiently "swap in" and "swap out" memory blocks as needed, and hence hold a volume of data that is larger than the amount of physical memory on the machine. (Read these nice posts from Rob Colum and Nicolas Robert if you want to know more about how and when the CAS Disk Cache is used.)
But in CAS, an action also allocates "resident" (physical) memory in order to process a table, for example to create computed columns or views that require RSS memory to hold rows and columns.
For this part of the analytics processing, the CAS_DISK_CACHE is not used; CAS allocates memory using the Threaded Kernel (TK), and the memory is obtained through the "mmap" system call with the MAP_ANONYMOUS flag. With this flag, the pages are only backed by either real memory or the paging files (with MAP_ANONYMOUS the mapping is not backed by any file).
However, in most Kubernetes systems, containers do not have a configured paging file, so CAS can only use real memory. This increases the risk of the CAS session processes (running in containers) being terminated by the OOM killer as soon as the container's defined memory limit is exceeded.
Having session processes forcibly killed is not a great experience for the end users.
As an aggravating factor, we know (from my previous blog) that, starting with recent Kubernetes versions (1.28), when a single CAS session process is killed by the OOM killer, all the other processes in the same cgroup (v2) are also killed, which causes the whole CAS server (SMP) or individual CAS workers (MPP) to go down, impacting the CAS service.
While it is understandable that, from time to time, a CAS action could fail (because the system is not equipped with enough memory to run it), it is hardly acceptable that the failure of a single CAS action causes the entire CAS deployment to restart and interrupts the CAS service...
That's the problem that is addressed by this new "Backing Store for CAS Memory Allocations" feature.
It is now possible to enable a "backing store" to support CAS memory allocations and prevent the whole system from being impacted when a single action requires more memory than has been made available to the CAS container.
With this feature, the CAS Threaded Kernel (TK) is informed that the files in a specified directory can be used to back most of the memory allocations.
The TK_BACKING_STORE_DIR environment variable is set in the CAS container and points to a specific path where the Threaded Kernel stores the mapping files (for example, /cas/tkMemory).
The path is mounted in the pod and mapped to a Kubernetes emptyDir volume.
As noted in the official Kubernetes documentation: "The emptyDir is created when a pod is assigned to a node and is initially empty. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently".
By default, the content of emptyDir volumes is stored on the node's root disk (typically under /var/lib/kubelet). However, if the emptyDir.medium field is set to "Memory", the documentation explains that Kubernetes mounts a tmpfs filesystem (a RAM-backed virtual filesystem) instead of using the disk.
A size limit can then be specified to limit the capacity of the emptyDir volume. That's how we can control/restrict the memory consumption before it is too late...
In the CASDeployment custom resource specification, the result would look roughly like the sketch below. Note that this is an illustration rather than the exact content produced by the SAS-provided transformers: the volume name and the 24Gi size limit are example values.
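```yaml
# Illustrative fragment of a CASDeployment pod template with the backing store enabled.
# The volume name (cas-backing-store) and the 24Gi sizeLimit are example values.
spec:
  controllerTemplate:
    spec:
      containers:
        - name: cas
          env:
            - name: TK_BACKING_STORE_DIR    # tells the CAS Threaded Kernel where to create its memory-mapping files
              value: /cas/tkMemory
          volumeMounts:
            - name: cas-backing-store       # RAM-backed volume mounted at the backing store path
              mountPath: /cas/tkMemory
      volumes:
        - name: cas-backing-store
          emptyDir:
            medium: Memory                  # tmpfs: pages live in RAM, not on the node's disk
            sizeLimit: 24Gi                 # cap on the backing store size (typically ~80% of the container memory limit)
```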
The backing store for CAS memory allocations relies on the SizeLimit parameter as THE way to control the maximum amount of memory that a CAS action can use, without involving the OOM killer.
With the backing store feature enabled, when a CAS action is submitted to CAS, the action starts to map the memory files in the emptyDir, and if the size of the memory-mapped files exceeds the defined SizeLimit (24Gi in the example above), the CAS action fails individually.
The error message shown to the end users (for example in SAS Studio or any other CAS client) will inform them that the CAS action has failed because it ran out of memory.
It will look like this:
Or like that:
With this new behavior, the problem is addressed before involving the OOM killer and seeing it terminate all the nearby CAS processes. It helps ensure that the CAS server and even the CAS session itself suffer less impact from large-memory tasks, allowing the end-users to continue to work with CAS.
The “fail fast” principle is used here to prevent the OOM killer issue and improve the CAS overall stability.
Finally, the release of the files in the backing store follows the same pattern as releasing memory back to the operating system without this feature enabled. When the CAS session process terminates, all of the files used to back the TK memory are released (which ensures there will be no leaked resources).
While most of the use cases benefit from the “Backing Store for CAS Memory Allocations” feature to prevent OOM kills, there are some exceptions.
Not all memory consumption is constrained by the size of the backing store; some types of allocations bypass it.
For example, Python or Java code running inside the CAS session does not go through the CAS TK memory allocation and, as such, is not subject to the memory control via the TK backing store.
The SizeLimit value defines the maximum amount of memory that can be used by CAS actions before causing an "out of memory" failure.
For the CAS backing store to be effective, it is important that this threshold is lower than the threshold that would trigger the OOM killer. The goal is to ensure that the action fails and displays an "out of memory" error message to the end user running it, before the CAS container memory limit is exceeded and all processes are killed.
Testing with the default CAS auto-resources configuration has shown that setting the backing store size limit to about 80% of the memory available on the node seems to be effective in preventing OOM kills in most situations. In situations where the CAS resource requests and limits are manually set (usually to place more than one CAS pod on a node), it is recommended to use a similar fraction (80%) of the container memory limit.
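As a rough illustration of that sizing rule (the numbers below are made up for the example and are not SAS recommendations), a minimal pod-spec fragment could look like this:

```yaml
# Fragment only, with made-up values: a CAS container manually limited to 32Gi
# of memory, and a backing store sized at roughly 80% of that limit.
spec:
  containers:
    - name: cas
      resources:
        requests:
          memory: 32Gi
        limits:
          memory: 32Gi
  volumes:
    - name: cas-backing-store     # same illustrative volume name as in the earlier sketch
      emptyDir:
        medium: Memory
        sizeLimit: 26Gi           # ~80% of the 32Gi container memory limit
```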
In CAS, it is possible to implement resource management through policies that are created with the Viya CLI. You can create up to five "priority-level" policies per CAS server, which are used to place space quotas on table data. Each “priority-level-n” policy is associated with a group of users. If you enable this kind of configuration, it is also possible to align the backing store configuration with the policies and define distinct TK memory allocation limits for each policy.
Finally, note that SAS R&D is also currently researching additional ways to prevent the OOM killer from terminating CAS processes. One alternative workaround is based on the use of a Kubernetes Daemonset to overwrite the cgroups configuration files created by Kubernetes. The issue with the OOM "group-kill" change in Kubernetes is not specific to CAS, other applications have been impacted and a pull request to allow configuration of the "group OOM kill" behavior should be included in Kubernetes 1.32.
When the "Backing Store for CAS Memory Allocations" capability was introduced for the first time in SAS Viya, the official way to enable it was to patch the CASDeployment Custom resource deployment or to directly edit the CASDeployment Custom resource.
But since version 2024.11, there is an official Kustomize PatchTransformer to apply the changes. A new "Configure a Backing Store for Memory Allocations" paragraph has been added in the "Optional Customization" section of the official documentation.
There are actually four distinct transformers; which one to apply depends on the CAS configuration that is in place.
As usual, once you have copied and, if needed, adjusted the values in the transformer YAML file, don't forget to reference it in the transformers section of your main kustomization.yaml file.
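For example (the site-config path below is an assumption about your own layout; use the location where you copied the transformer file):

```yaml
# kustomization.yaml (fragment) – reference the copied transformer file;
# the site-config path shown here is an assumed example location.
transformers:
  - site-config/cas-enable-default-backing-store.yml
```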
Note that the Kubernetes manifest generated from the Kustomize build (with the updated reference) must be re-applied, and the CAS server must be restarted to pick up the configuration change.
Here is an example of what you would see in the CAS pod specification when CAS auto-resources is enabled (no use of CAS Resource Management policies) and when the backing store is enabled with the default transformer (cas-enable-default-backing-store.yml).
The first screenshot shows the effect of the CAS auto-resource configuration on an 8 CPU/64GB machine. The CAS container requests and limits are automatically computed by the CAS operator.
Then you can see below the effect of enabling the Backing Store in the CASDeployment CR:
The previous posts on this topic talked about switching from cgroups v2 back to cgroups v1 as a workaround to avoid the OOM killer issue in Kubernetes. However, the Kubernetes community has decided to move cgroups v1 into maintenance mode as of Kubernetes 1.31, so a solution that relies on moving away from cgroups v2 may be poorly received... that's another reason to opt for this new "Backing Store for CAS Memory Allocations" feature instead.
Finally, nothing really prevents a specific CAS action from running out of memory at some point. It's always a possibility... maybe because the code has not been optimized, or the dataset is too large, or simply because running this action on this amount of data requires more memory than is physically available in the infrastructure. However, with this new "backing store" feature, this type of situation is reported as an "out of memory" condition and only affects the individual user's session, as opposed to all sessions running in a CAS pod being arbitrarily killed.
That's why this change really improves the overall reliability and availability of the CAS server.
Now, as a little reward for your perseverance in reading this post 😉, you can find below a short 3-minute video that quickly demonstrates the benefit of using the "Backing Store for CAS Memory Allocations".
Thanks for reading!
Find more articles from SAS Global Enablement and Learning here.