
Kubernetes Storage Patterns for SASWORK – part 1


The SAS Programming Runtime Environment (instantiated as the SAS Compute Server, SAS Connect Server, or SAS Batch Server) is a core workhorse analytics engine for the Viya platform. It's not an in-memory analytics engine like CAS, so it relies on disk to hold interim output as it crunches through large volumes of data to complete a job. That scratch space is called SASWORK. So, it's smart to address storage performance in terms of raw throughput as well as I/O operations per second (IOPS) to ensure that SAS runtime processing isn't overly constrained by the storage it's using.

 

In order to do that, we need to understand the abstractions offered by Kubernetes and our infrastructure provider of choice to provision the right kind of storage for SASWORK. Let's now look at some storage patterns to understand how this gets done.

 

 

About SASWORK

 

SASWORK is disk space that's used automatically by the SAS runtime and, if desired, programmatically by the user. To do the analytics tasks that SAS excels at, the runtime typically performs long, sequential reads and writes of data - very different from the tiny operations of transactional systems. SAS typically recommends throughput rates in the range of 100-150 MB per second per core for the storage backing SASWORK.

 

Also, the actual SASWORK location for a given SAS runtime instance is uniquely named for exclusive use. Files in SASWORK are automatically deleted when SAS terminates normally… however, those files might be abandoned if the SAS process crashes unexpectedly. Extra steps are then required to clean up those orphaned files and reclaim the disk space.

 

As you'll see in the illustrations below for the SAS runtime instantiated as a SAS Compute Server pod in Kubernetes, SASWORK is configured to use the internal container path of /viya. Our goal is to map that path to an external physical volume that's suitable for the kind of I/O that SASWORK generates. That might be an arbitrary path on disk attached to the local host machine, or a managed storage service.
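
To make the discussion concrete, here's a minimal, hypothetical sketch of that mapping in plain Kubernetes terms - this is not the actual SAS Viya manifest (the real pod templates are generated and patched through the deployment's kustomize overlays), just the pattern we'll be varying: a container mounts a volume named "viya" at /viya, and the storage patterns below differ only in how that volume is defined.

apiVersion: v1
kind: Pod
metadata:
  name: sas-compute-example                                  # hypothetical name, for illustration only
spec:
  containers:
  - name: sas-programming-environment
    image: registry.example.com/sas-programming-environment  # placeholder image
    volumeMounts:
    - name: viya
      mountPath: /viya              # SASWORK lives under this path inside the container
  volumes:
  - name: viya
    emptyDir: {}                    # the piece each storage pattern below swaps out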

 

 

Amazon examples

 

The following illustrations will reference resources in Amazon Web Services to explain some concepts with concrete examples. Generally speaking, the other cloud providers offer similar technology by a different name.

 

Quick reference:

 

  • AWS: Amazon Web Services - provides cloud infrastructure services
  • EC2: Elastic Compute Cloud - the core AWS offering that provides compute capacity (virtual servers) in the cloud
  • EBS: Elastic Block Store - provides block-level storage volumes for virtual servers
  • EKS: Elastic Kubernetes Service - a managed service that handles the complexities of running the Kubernetes control plane

 

 

emptyDir - the default

 

Kubernetes explains that an emptyDir is a volume that is created when a pod is assigned to a node - and it's automatically deleted when the pod is removed from the node. This makes it ideal for scratch space, like SASWORK.

k8s-saswork-1a.png


 

But there's more to it than that. By default, an emptyDir is allocated from the node's root volume. This means that if too much data lands in SASWORK, the root volume can fill up, which prevents the OS and other services from running. That's hard to monitor and manage, so we typically recommend avoiding emptyDir for SASWORK in production environments.
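
If emptyDir is used anyway (for a sandbox, say), a sizeLimit at least puts a guardrail on how much node storage the pod can consume - Kubernetes evicts the pod rather than letting SASWORK silently fill the root volume. A minimal variation on the volumes stanza from the sketch above (the 300Gi figure is purely illustrative):

volumes:
- name: viya
  emptyDir:
    sizeLimit: 300Gi     # illustrative cap; the pod is evicted if SASWORK grows beyond it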

 

In AWS, there's another interesting twist. The OS root volume for the EC2 hosts used in EKS is provided by EBS - network-attached storage, not a local disk on the host machine. That's because some instance types don't offer local disk at all. You can configure what kind of EBS storage backs the root volume, but the default nowadays is "gp3" - cost-effective general-purpose volumes. A higher-performance alternative is "io2" storage, where you can specify the I/O performance factors you need.
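
For example, with eksctl the root volume type and its performance settings are declared on the node group. A hedged sketch along these lines (the cluster and node group names are hypothetical, and the values are illustrative - check the eksctl schema for your version):

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: sas-viya-cluster          # hypothetical cluster name
  region: us-east-1
managedNodeGroups:
- name: sas-compute-nodes         # hypothetical node group name
  instanceType: m5.4xlarge
  volumeSize: 200                 # GiB for the EBS-backed OS root volume
  volumeType: gp3
  volumeIOPS: 10000               # gp3 lets you dial up IOPS...
  volumeThroughput: 500           # ...and throughput (MiB/s) independently of size
  # for Provisioned IOPS SSD, use volumeType: io2 with a higher volumeIOPS value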

 

Note that when the OS root volume is also relied on for SASWORK, a single EBS volume handles all of that activity - the OS, other services, and SASWORK all end up competing for I/O to the same storage.

 

 

hostPath - to be avoided

 

There's another way to mount a local directory into your pod: hostPath. While it's simple to implement, the significant security problems inherent in using hostPath mean that it should be avoided for nearly all use cases, even simple demos. (See the Kubernetes documentation for an explanation of the serious privilege-escalation vulnerabilities that hostPath presents.)

k8s-saswork-2a.png

 

The reason hostPath is attractive is that it provides a simple, direct way to mount a directory on a locally-attached SSD (or NVMe) drive - in AWS, this is called an Instance Store. This usually provides the kind of fast, low-latency storage that lets SASWORK run efficiently, and it avoids I/O contention on the OS root volume, too.
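
For completeness (and with the security caveats above firmly in mind), the volume definition would look something like this - /mnt/nvme/saswork is a hypothetical path where the instance store has already been formatted and mounted on the node:

volumes:
- name: viya
  hostPath:
    path: /mnt/nvme/saswork      # hypothetical mount point on the node's local NVMe instance store
    type: DirectoryOrCreate      # create the directory on the node if it doesn't already exist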

 

While hostPath is often lumped in with ephemeral storage, files written there are not automatically deleted by Kubernetes when the pod is removed from the node (in contrast with emptyDir). For well-behaved use of SASWORK, this is usually okay because the SAS runtime deletes its SASWORK files when it shuts down. However, if the SAS runtime crashes, it abandons its files in SASWORK, leaving them behind to be cleaned up manually later.

 

 

GeV using block storage - pretty good

 

As it turns out, having a volume deleted automatically when its pod terminates is a pretty cool feature - and it reduces the admin workload of policing and cleaning up arbitrary disk space on the Kubernetes nodes. Generic ephemeral volumes (GeV) are similar to emptyDir volumes in the sense that they provide a per-pod directory for scratch data that is typically empty after provisioning.

k8s-saswork-3a.png

 

Note in this example that we've defined SASWORK to use a GeV with storage in EBS. For this use of EBS, we need to install the AWS EBS CSI driver; the Container Storage Interface (CSI) is the plug-in mechanism that Kubernetes uses to work with the desired storage offering. The cool thing to see here is that this sets up an independent EBS volume just for SASWORK - so it isn't competing with the OS root volume for I/O. Each volume has its own network mount and I/O channel.
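
In pod-spec terms, a generic ephemeral volume is declared inline, and Kubernetes creates (and later deletes) the PVC on our behalf. A sketch, assuming a hypothetical storage class named "ebs-gp3-saswork" like the one shown a bit further down:

volumes:
- name: viya
  ephemeral:
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-gp3-saswork   # hypothetical class, defined below
        resources:
          requests:
            storage: 300Gi                  # illustrative SASWORK size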

 

Going further, the illustration shows SASWORK using a "gp3" volume, which can be configured to provide up to 16,000 IOPS and 1,000 MiB/s of throughput. However, AWS also offers Provisioned IOPS SSD volumes where you can specify even higher I/O performance. These are referred to as "io2" volumes (or, more extreme, "io2 Block Express"). While they can be faster (up to 64,000 IOPS for io2, and up to 4,000 MiB/s of throughput with Block Express) and offer improved durability compared to "gp3", they often cost quite a bit more depending on the performance characteristics you specify.
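
The storage class is where those performance characteristics get dialed in. Here's a hedged sketch of the hypothetical "ebs-gp3-saswork" class using the AWS EBS CSI driver (the parameter values are illustrative - consult the driver's documentation for the options your version supports):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-saswork              # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "16000"                      # gp3 maximums shown; smaller values cost less
  throughput: "1000"                 # MiB/s
reclaimPolicy: Delete                # delete the PV when the (ephemeral) PVC goes away
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true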

 

My $0.02 - When it comes to GeV, I prefer to think of them as Generic Ephemeral Volume Claims. That's because a PVC normally has a lifecycle independent of the pod that refers to it. When the pod is terminated, the PVC remains until it's deleted in a separate step. The way GeV work, the PVC itself is automatically deleted by Kubernetes when the pod that requested it terminates.

 

What happens when the PVC is deleted (either way)? If the storage class was configured with reclaimPolicy=Delete, then the associated PV is automatically deleted, too. So, we still end up with ephemeral volumes, but the difference in handling PVC is an interesting twist.

 

 

GeV using local storage - interesting

 

There's another way to define GeV and that's to use local storage (if available). This gives us that high-speed, low-latency local SSD (or NVMe) drive that is great for SASWORK.


k8s-saswork-4a.png

 

And if we elect to use storage that's dynamically provisioned on the local disk, then we get that fully dynamic storage lifecycle - like emptyDir - but without the drawbacks. In other words, avoid using Kubernetes Local Static Provisioner and instead try using a dynamic provisioner like the Rancher project called Local Path Provisioner or OpenEBS Local PV Hostpath.
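
As an example of the local route, the Rancher Local Path Provisioner ships with a storage class much like this one (shown as a sketch; check the project's deploy manifests for the current definition):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path                 # the class name the provisioner ships with
provisioner: rancher.io/local-path
reclaimPolicy: Delete              # orphaned SASWORK files disappear along with the PV
volumeBindingMode: WaitForFirstConsumer   # wait until the pod lands on a node

The host directory that backs these volumes is set in the provisioner's own ConfigMap (by default a path like /opt/local-path-provisioner), so pointing it at a locally-attached SSD or NVMe mount is a provisioner-level configuration choice rather than something each pod has to know about. From the pod's perspective, SASWORK simply requests this class via storageClassName: local-path in the ephemeral volumeClaimTemplate shown earlier.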

 

For clarity - and because I was confused by these naming choices, too:

 

Rancher Local Path Provisioner:
  • does not require the Rancher Kubernetes management platform. It's a standalone, open-source dynamic provisioner from the Rancher project that creates hostPath-backed volumes on the node.

 

OpenEBS Local PV Hostpath:
  • is not based on Amazon Elastic Block Store. OpenEBS is an independent, open-source storage project that works with Kubernetes in any cloud or on-premises environment.
  • does not give you direct access to Kubernetes hostPath volumes (discussed above). Instead, it lets you define the path on the host where its dynamically provisioned volumes are placed.

 

Configuring the SAS runtime to reference a generic ephemeral volume for SASWORK, using a storage class that specifies a directory path on the node, gives us the ability to use locally-attached disk - not the OS root volume - and to have any orphaned files left behind by a process crash deleted automatically. This keeps the system nice and tidy.

 

 

But wait, there's more

 

There are other solutions that accomplish many of these goals. For example, we've seen SAS Viya used successfully with the Portworx storage platform. The main thing is to understand how SASWORK operates and the attributes of storage needed to ensure it's fast, efficient, and cost-effective.

 

In the next post, we'll consider the storage attributes required by the SAS checkpoint-restart functionality. We can't use traditional ephemeral storage for SASWORK in that case, because the idea is that a new instance of the SAS runtime picks up where an old instance left off. Instead, we need storage that is accessible from any Kubernetes node, which means local storage offerings and single-pod-access volumes are off the table.

 

 

References and Resources

 

In addition to the links above, the GEL team also provides courses on learn.sas.com:

 

 

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Thanks for these detailed explanations, which are essential in practice. I especially liked the diagrams - self-explanatory - which bring some more clarity to this difficult and quickly evolving field. The link to the Portworx post is non-public or unreachable with only public web access.

Thanks for the feedback, @ronan! And thanks for the heads-up about the internal link -  if that post gets published externally here, I'll be sure to share it. 
