Historically, SAS analytic processes have often depended on good storage. For the SAS runtime (like sas.exe, SAS Workspace Server, SAS Compute Server, etc.) there was one area in particular that needed performant storage: SASWORK. We typically care about two things for SASWORK: 1) the size and 2) the speed. We recommend sites set up disk for SASWORK that can deliver I/O throughput rates of 100-150 MB per second per core. And that's just to get the conversation started - in some cases, slower disk might be acceptable, but the SAS runtime can actually handle much faster rates than that.
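To make that per-core guideline concrete, here is a minimal Python sketch that scales the 100-150 MB per second per core range up to a whole host. The function name and the 16-core example are assumptions for illustration only; they are not part of any SAS tooling.

```python
def saswork_throughput_range(cores, low_mb_s=100, high_mb_s=150):
    """Return (low, high) aggregate SASWORK throughput in MB/s for a host.

    The defaults reflect the 100-150 MB/s per core guideline quoted above.
    """
    if cores <= 0:
        raise ValueError("cores must be a positive number")
    return cores * low_mb_s, cores * high_mb_s


# Hypothetical example: a 16-core compute host.
low, high = saswork_throughput_range(16)
print(f"16 cores: {low}-{high} MB/s")
```

Treat the result as a starting point for the sizing conversation rather than a hard requirement.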
With that in mind, let's talk about the SAS Viya platform. It relies on storage, too. SASWORK is in there, yes. And even CAS - our high-performance in-memory analytics engine - mounts several storage volumes for a variety of purposes including data source, operational files, caching, backups, and more. Storage is used by many other services in the Viya platform as well.
But now that we live in a Kubernetes world, there are more storage attributes to address than just size and speed. Indeed, we need to tackle these additional attributes early so that we know where to put the size and speed later.
Persistent Volumes (PV) and Persistent Volume Claims (PVC) are Kubernetes objects that pods can refer to for storage. They're "persistent" in that they are independent of a pod's lifecycle. A pod can fire up, refer to a PVC to get a PV with storage space from the specified provider and save some data out there, then terminate. The PVC will remain and the PV will, too. Therefore the files saved by the pod are still available. For certain kinds of pods (usually managed as StatefulSets), new instances can refer to the files left behind in the PV by re-using the PVC.
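As a rough illustration of what a PVC looks like, the Python sketch below builds a claim manifest as a dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The claim name, storage class name, and size are hypothetical placeholders, not values taken from a Viya deployment.

```python
import json


def build_pvc(name, storage_class, size, access_mode="ReadWriteOnce"):
    """Build a minimal PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size, for illustration only.
pvc = build_pvc("example-data", "example-storage-class", "10Gi")
print(json.dumps(pvc, indent=2))
```

A pod refers to the claim by name, and the claim in turn is bound to a PV that satisfies the requested size, access mode, and storage class.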
Let's look at three categories of storage attributes in particular that are relevant to PVs and PVCs:
Provisioning Type - when is storage defined:
Access Mode - how many services need access to the storage:
Reclaim Policy - what to do with the volume when we're done with it:
If we take those three categories above and look at combinations of their values, then we get 8 possible outcomes. Due to how abstraction works in Kubernetes (PV, PVC, storage class, storage provisioner, storage infrastructure, etc.), we need to understand how these combinations work (and don't).
Static provisioned, Retain, RWO
Static provisioned, Retain, RWX
Static provisioned, Delete, RWO
Static provisioned, Delete, RWX
Dynamic provisioned, Retain, RWO
Dynamic provisioned, Retain, RWX
Dynamic provisioned, Delete, RWO
Dynamic provisioned, Delete, RWX
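The eight combinations above are simply the Cartesian product of two values in each of the three categories. A short Python sketch, assuming only the value names used in the list above, reproduces them in the same order:

```python
from itertools import product

PROVISIONING = ("Static", "Dynamic")
RECLAIM_POLICY = ("Retain", "Delete")
ACCESS_MODE = ("RWO", "RWX")

# product() varies the last category fastest, matching the list above.
combos = [
    f"{prov} provisioned, {reclaim}, {access}"
    for prov, reclaim, access in product(PROVISIONING, RECLAIM_POLICY, ACCESS_MODE)
]

for combo in combos:
    print(combo)
```

Adding a third value to any one category would raise the count from 8 to 12, which is part of why the planning space grows so quickly.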
In the 2018 movie Avengers: Infinity War, Doctor Strange uses the Time Stone to view 14,000,605 possible futures in the fight against Thanos. Well, sometimes it feels like we have the same potential when planning storage volumes for the Viya platform. As they say, every rule has an exception… but in Kubernetes, it seems the exceptions themselves are often the rule.
For example, I've made a couple of references to SASWORK relying on dynamically provisioned, RWO volumes (ideally using local disk), with reclaimPolicy=Delete. But guess what - there's some cool functionality that might affect that. Checkpoint-Restart allows one SAS runtime to pick up where another left off. This is especially useful for very long-running ETL jobs, where having to restart a failed 12-hour job from the beginning after it completed the first 10 hours brilliantly isn't desired. For this to work, though, the new SAS process needs access to the files in SASWORK left behind by the old process. This means that SAS runtime pods where we want Checkpoint-Restart ability should request volumes for SASWORK that are shared (RWX) with reclaimPolicy=Retain. This might involve substantial infrastructure changes, potentially leading to the implementation and cost of a clustered file system. Take care to plan this carefully.
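The SASWORK trade-off just described can be summarized as a small decision function. This is only a sketch of the reasoning in this post; the function name and dictionary keys are made up for illustration and do not correspond to any SAS or Kubernetes API.

```python
def saswork_volume_profile(checkpoint_restart):
    """Return the access mode and reclaim policy suggested for SASWORK.

    Without Checkpoint-Restart, SASWORK is scratch space for a single pod.
    With it, a new SAS process must be able to read the old process's files.
    """
    if checkpoint_restart:
        return {"access_mode": "RWX", "reclaim_policy": "Retain"}
    return {"access_mode": "RWO", "reclaim_policy": "Delete"}


print(saswork_volume_profile(checkpoint_restart=False))
print(saswork_volume_profile(checkpoint_restart=True))
```

The function deliberately says nothing about the storage backend, since choosing a shared file system that performs well enough for SASWORK is the harder planning question.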
Also, each cloud provider offers a range of storage solutions that can be tiered to deliver increasing performance (with commensurate increasing cost). If we're interested in pursuing maximum performance with efficient cost structure by differentiating offerings so that the Viya platform gets just the right storage at each point, then we can certainly do so… but it will involve a lot of configuration, YAML syntax, and validation effort.
At the other end of the spectrum, we might consider small and one-off environments which are often suitable for demo or education purposes. In those cases, we'd like to minimize storage complexity… ideally with a one-size-fits-all approach that can be used to satisfy all Viya's storage needs.
That goal can be reached somewhat, but it necessitates breaking some important expectations. Consider if we collapse all 8 storage attribute combinations down such that we'll use just 1 storage class to provide Dynamic-Delete-RWX volumes that are hosted in a dedicated NFS server. Doing this, the Viya platform software will actually deploy and function. The problem is that this silver-bullet approach to storage disregards some documented system requirements.
You see, Postgres and other infrastructure services state in their documentation that NFS volumes (which we're using for RWX access mode here) should not be used to host their data. SAS also strongly recommends against NFS-based storage for SASWORK. That's because NFS adds latency, and its file locking, caching, and other overhead can get in the way when there's a lot of activity happening. This leads to timeouts, race conditions, and other problems.
In other words, we can go too simple with storage and still get a Viya environment that basically functions; however, it's not going to be performant for a real-world production environment. This is a larger challenge sometimes, because it's common for us to see something working and then hold it up as an example for reference. When we use overly simplified storage provisioning, we need to take care to understand what its limits and capabilities really are.
We've really just scratched the surface here on the myriad possibilities of storage assignments for the Viya platform. These are multiplied when expanding our view to include Kubernetes resources and cloud provider infrastructure offerings. It's important, however, to define and provision the correct storage to ensure the resources are appropriate in purpose and function so that Viya can perform at its full capacity.
SAS documentation can help guide us. See SAS® Viya® Platform Administration > Deploy and Update > System Requirements > Hardware and Resource Requirements > Persistent Storage Volumes, PersistentVolumeClaims, and Storage Classes.
And for more information about this and related topics, visit learn.sas.com. Look for the course titled SAS® Viya® Architecture for Practitioners where we cover storage, networking, scalability, availability, authentication, encryption, and many other aspects of the Viya platform.
Find more articles from SAS Global Enablement and Learning here.