As a deployment engineer about to start the installation of a new Viya environment in Kubernetes, or as a technical architect working on the storage design for a Viya project, you may hear about, or be questioned by your customer on, something called "CSI drivers"...
When searching the SAS official documentation, you might be surprised to find no list of officially supported CSI drivers for Kubernetes! This is because SAS unfortunately cannot test each and every technology associated with Kubernetes or containers in every scenario of a SAS Viya deployment.
As my colleague Rob Collum recently said in one of our team meetings: "Some things are tested and supported by SAS and some other things… can possibly work." His remark was spot-on: a lot of technologies used in the Cloud or in Kubernetes platforms can be used and are likely to work with SAS Viya, but it does not mean that they have been officially tested by SAS for every kind of SAS Viya deployment…
However, what we can do as SAS professionals is share the experience and the lessons learned in the field, where customers have implemented proven technologies even when those technologies were not explicitly listed as fully tested and supported in the SAS documentation.
There are several examples of that: customers deploying Kubernetes platforms with limited support (Tanzu, Rancher, etc.), customers using new network or storage container integration technologies, and so on.
In this particular post, we discuss and share some feedback on the usage of the Container Storage Interface (CSI) drivers.
In order to expose external storage to the containers running in the pods, Kubernetes initially came up with an "in-tree" volume plugin system (including things like awsElasticBlockStore, azureDisk or cinder).
But this code was part of the Kubernetes core, which caused friction over time: it forced storage vendors to align with the Kubernetes release cycle, and it caused reliability and security issues in the core Kubernetes binaries (with code that was difficult for the Kubernetes maintainers to test and maintain).
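For context, and purely as an illustration (the pod name and volume ID below are placeholders), this is roughly what the deprecated in-tree style looks like, with the volume plugin referenced directly in the pod specification:

```yaml
# Illustrative only: the legacy "in-tree" approach, where the volume plugin
# (awsElasticBlockStore) is compiled into Kubernetes itself and referenced
# directly in the pod spec. Names and IDs below are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: legacy-ebs-example
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    awsElasticBlockStore:
      volumeID: "vol-0123456789abcdef0"   # pre-existing EBS volume (placeholder)
      fsType: ext4
```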
To address this challenge, the CSI (Container Storage Interface) was developed as "a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes".
Or simply put, CSI is a standard that allows Kubernetes to interact with storage systems, and a CSI driver is a storage vendor's implementation of that standard.
Nowadays, most of the original volume plugins (awsElasticBlockStore, azureDisk or cinder) are deprecated, and the official Kubernetes documentation recommends using the "out-of-tree" volume plugins implementing the CSI.
Once a CSI driver is installed in the cluster, new storage classes become available in Kubernetes and can be referenced by the SAS Viya PersistentVolumeClaims, volumeClaimTemplates, or custom resources.
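As a minimal sketch of the pattern (all names and sizes here are illustrative, not taken from an actual Viya manifest), a storage class whose provisioner is a CSI driver, and a PVC requesting a volume from it, could look like this:

```yaml
# Minimal sketch: a storage class served by a CSI driver and a PVC that
# references it. The class and claim names are hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-csi-storage               # hypothetical storage class name
provisioner: disk.csi.azure.com      # the CSI driver acts as the provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim                # hypothetical PVC name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: my-csi-storage   # ties the claim to the CSI-backed class
  resources:
    requests:
      storage: 8Gi
```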
In the Cloud managed Kubernetes environments (AKS, EKS, GKE, etc.), the Cloud CSI drivers may be pre-installed and the associated storage classes are already available.
Cloud providers supply CSI drivers for better integration with their managed storage services. In the on-premises world, storage system vendors supply CSI drivers for better integration with their SAN or NAS storage systems.
The NFS subdir provisioner is generally more flexible than the CSI drivers, and its usage is currently encouraged both in our official documentation and in the SAS "Deployment as Code" GitHub project (aka DAC).
However, we see more and more customers expressing concern about using "open-source community provided" software in their production environments and requiring the use of officially Cloud-supported CSI drivers instead.
In addition to the Cloud vendor support, there can be other benefits, such as encryption in transit (TLS) or at rest, automatic reprovisioning when additional space is required, and advanced integration with the Cloud file services (snapshots, backups, etc.).
The CSI drivers are mentioned in only two specific locations of the SAS official documentation.
While the SAS documentation does not provide a formal list of supported CSI drivers, there is a warning in this section against the usage of specific CSI drivers.
Two specific CSI drivers that do not meet the CAS and SPRE pods' requirement for a fully POSIX-compliant file system are listed: the Azure Files CSI driver and the Amazon EFS CSI driver.
These warnings come from real-life experience, where customers have tried to use these CSI drivers in the field and faced issues during file access operations with CAS, the Compute servers, or the internal Crunchy PostgreSQL.
As an example, read about two SAS colleagues' experience with the aws-efs-csi-driver in the extract below, from a SAS internal blog:
Using the aws-efs-csi-driver
"In our IAC for AWS, the nfs-subdir-external-provisioner is used when EFS is the desired persistent storage. However, when customers aren’t using our IAC, the aws provided efs csi driver is commonly used instead. Honestly, it makes sense logically, but there is of course a gotcha when this choice is made. While EFS claims to be POSIX compliant, it seems the provided csi driver didn’t get the memo. When using the EFS CSI driver, there is no way to alter ownership of files or directories in PVCs. The chown/chgrp command will fail as described here. To add to the fun, the issue has been marked as closed and not planned… This often presents itself as: 'rsync: chown … failed: Operation not permitted (1)'. For now, our best recommendation is to avoid use of the aws-efs-csi-driver and instead use the nfs-subdir-external-provisioner as the IAC does."
But as noted in the documentation, and explained in the post’s introduction: "This is not an exhaustive list of storage options that do not fulfill SAS requirements. Many storage options are available across all the supported cloud platforms and Kubernetes distributions. SAS cannot test with all of them."
Now we have to be careful…these known limitations, listed above, DO NOT MEAN that Azure Files or Amazon EFS storage can never be used with SAS Viya…
The devil is in the detail…and yes I know, it is complicated…
But for example:
So, when considering the storage options for a Viya environment, it is important to take several things into account:
The other location where CSI drivers are mentioned in the SAS official documentation is the "Cluster requirements for AWS" section.
This time, instead of a limitation, there is a requirement to have the AWS Elastic Block Store (EBS) CSI driver installed in the cluster.
As explained in this blog post from Rob Collum, the reason for this requirement is that, since EKS 1.23, without a specific installation of the EBS CSI driver inside the cluster, it is not possible to use the default AWS gp2 or gp3 storage classes.
These default storage classes are recommended for the SAS Viya volumes that require RWO access (such as OpenDistro, Redis, RabbitMQ, etc.).
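As an illustration (this is a common shape for such a class, not a SAS-mandated configuration, and the class name is arbitrary), a gp3 storage class served by the EBS CSI driver typically looks like this:

```yaml
# Sketch of a gp3 storage class backed by the EBS CSI driver on EKS.
# The driver itself is usually installed as the aws-ebs-csi-driver EKS add-on.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3                                             # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true" # optional: make it the default
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: ext4
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```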
At this point, and to avoid further confusion, let's look at a small table listing the most common Cloud vendors' CSI drivers.

| Cloud provider | CSI driver | Storage service | Storage type | Access mode |
|---|---|---|---|---|
| AWS | efs.csi.aws.com | EFS (Elastic File System) | NFS managed service | RWX |
| AWS | ebs.csi.aws.com | EBS (Elastic Block Store) | Managed disk (block storage) | RWO |
| Azure | file.csi.azure.com | Azure Files Premium | NFS managed service (NFS support) | RWX |
| Azure | disk.csi.azure.com | Azure Disks | Managed disk (block storage) | RWO |
| GCP | filestore.csi.storage.gke.io | Google Filestore | NFS managed service (NFS support) | RWX |
| GCP | pd.csi.storage.gke.io | Google Persistent Disks (standard-rwo) | Managed disk (block storage) | RWO |
While you can use the Azure Files CSI driver (assuming you use Azure Files Premium with the NFS support), there is an additional gotcha that you'll need to know about if you want to use it for your SAS Viya platform's dynamically provisioned volumes...
Back in 2023, a SAS colleague explored the use of the Azure Files Premium CSI driver and provided guidance on how to implement it with the NFS protocol and the nconnect mount option.
He wrote an internal blog showing how you can provision the SAS Viya dynamically requested volumes with the Azure Files CSI driver during the deployment, and it all works perfectly.
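For reference, a hedged sketch of that kind of storage class (the class name is hypothetical; the parameters follow the Azure Files CSI driver conventions) could look like this:

```yaml
# Sketch: Azure Files Premium over NFS with the nconnect mount option,
# provisioned by the Azure Files CSI driver. Names and values are illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sas-azurefile-premium-nfs    # hypothetical class name
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS               # NFS shares require the premium tier
  protocol: nfs
mountOptions:
  - nconnect=4                       # several TCP connections to the NFS share
reclaimPolicy: Delete
allowVolumeExpansion: true
```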
…HOWEVER…
In this case, for each dynamically requested Viya volume, the use of the CSI driver triggers the provisioning of a distinct new Azure file share.
The problem is that, as of today, the minimum size for an Azure Premium file share is 100 GB.
The result is that the provisioned capacity in Azure is way higher than what is effectively needed… Some volumes only require 1 GB (or even less), but a 100 GB file share is provisioned in Azure for each of them.
The screenshot below compares the required capacity with the provisioned capacity.
So while it is technically possible to use the CSI driver here, the cost is very high and a lot of space is wasted. This situation was recently faced at a customer site by a SAS colleague in the AP region.
A first workaround that the customer tried was to set the shareName attribute in the Azure Files storage class. When this attribute is set, all volumes that use the azurefile-csi storage class consolidate their content in the same Azure Files share (so there is no need to create one file share for each PVC).
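As a hedged sketch (class and share names are hypothetical), setting shareName in the storage class looks like this:

```yaml
# Sketch of the workaround that was tried: fixing shareName in the storage
# class so that every dynamically provisioned volume lands in the same
# Azure file share. Names are hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile-csi-sharename      # hypothetical class name
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS
  protocol: nfs
  shareName: viya-shared             # all PVCs reuse this single file share
reclaimPolicy: Retain
```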
For example, if we have the following 9 PVCs:
…then the content of all the corresponding PVs is created at the root level of a single file share, as below:
… which poses a risk of data contention and potential overwrites due to the lack of directory isolation.
This shareName attribute has been tested by SAS R&D, who confirmed that it is not suited for dynamic volume provisioning. Using this setting with Viya could lead to data corruption issues, as the Viya components, which request storage through distinct PVCs, expect distinct volumes to be provisioned.
As a result of this experience and testing, a new section, Guidance When Using Microsoft Azure Files, was added to the official documentation.
The official workaround, in order to use Azure Files for the dynamic provisioning of the RWX-access based Viya platform volumes, is to use an NFS provisioner. The Kubernetes NFS Subdir External Provisioner can be used to create a custom storage class based on an existing Azure Premium Files share. The NFS provisioner creates a distinct subdirectory for each Viya volume under the Azure file share. Note that (at the time of writing) this NFS subdir external provisioner is used for various NFS-based storage types by the Deployment as Code project (aka "viya4-deployment").
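As a hedged sketch of this approach (the storage account, share path, and class name below are placeholders), the Helm values for the NFS Subdir External Provisioner pointed at an existing Azure Premium Files NFS share could look like this:

```yaml
# Hedged sketch: Helm values for the Kubernetes NFS Subdir External Provisioner,
# pointed at an existing Azure Premium Files share mounted over NFS.
# Install, assuming the chart repository has already been added:
#   helm install nfs-subdir-provisioner \
#     nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -f values.yaml
nfs:
  server: mystorageacct.file.core.windows.net   # NFS endpoint of the storage account (placeholder)
  path: /mystorageacct/viya-share               # /<account>/<existing premium share> (placeholder)
storageClass:
  name: sas-azurefile-nfs-subdir                # class that the Viya PVCs will reference
  archiveOnDelete: "false"                      # do not keep an archived copy when a PVC is deleted
```

Each Viya PVC that references this class then gets its own subdirectory under the shared file share, which avoids the one-share-per-volume over-provisioning described above.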
This "over-provisioning" issue when using the CSI driver for dynamic provisioning is not specific to Azure, the same problem has been seen in the past with the Google File Store.
However, while the GKE Filestore CSI driver in versions 1.26 and earlier only allowed for 10 shares with a 100 GB minimum each, things have improved with the enterprise-multishare-rwx GKE storage class, where the minimum share size has been reduced to 10 GB.
While there might still be a bit of extra provisioned space, the overhead is minimal.
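For illustration (the claim name is hypothetical; the storage class name is the pre-defined GKE one mentioned above), a small RWX claim against that class could look like this:

```yaml
# Sketch: a small RWX claim against GKE's enterprise-multishare-rwx class,
# which packs several PVCs as shares of a single Filestore Enterprise instance.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-small-rwx-claim      # hypothetical PVC name
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: enterprise-multishare-rwx
  resources:
    requests:
      storage: 10Gi                  # close to the reduced per-share minimum
```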
While using CSI drivers can cause issues, they are also something that more and more customers are expected to require, as they often provide better integration with the managed storage services and extend the Kubernetes capabilities (encryption, snapshots, DR, etc.).
But remember that there is no list of SAS officially supported CSI drivers... what we have in the documentation today is more a list of CSI drivers that SHOULD NOT be used for specific volumes. The cost efficiency also depends on the kind of volumes for which you would use the CSI driver. For example, the Azure Files CSI driver, which is supported with the premium SKU (NFS support), is probably not cost efficient for all the Viya platform dynamically created volumes, but it may be cost efficient for the SASDATA or CASDATA static volumes...
Each volume comes with its own specific requirements (access mode, capacity, latency, security, POSIX compliance, data protection, etc.). Having multiple storage classes is a recommended practice, as it allows you to associate the right kind of storage technology and device with each persistent volume.
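Since the Viya platform is deployed with kustomize, one possible (purely illustrative) way to associate a dedicated storage class with a specific claim is a PatchTransformer; the PVC and class names below are hypothetical, not taken from the SAS deployment assets:

```yaml
# Illustrative kustomize PatchTransformer: point one specific PVC at a dedicated
# storage class while the other volumes keep the cluster default.
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: example-pvc-storage-class
patch: |-
  - op: add
    path: /spec/storageClassName
    value: sas-azurefile-nfs-subdir   # hypothetical custom storage class
target:
  kind: PersistentVolumeClaim
  name: example-claim                 # hypothetical PVC name
```

The transformer file would then be referenced in the transformers section of the kustomization.yaml.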
The topic of storage in Kubernetes is complex and the technologies are evolving rapidly…Some of the issues or limitations discussed in this post may disappear in the future, but other issues could arise when untested scenarios are implemented…
Here are two examples:
Finally, while it would be impossible to test, support, and list every possible CSI driver implementation with the SAS Viya platform, it is assumed that CSI drivers can be used for most of the SAS Viya volumes in general. However, if new problems or limits with specific CSI drivers are discovered in the field and reported, they can be reproduced and assessed by SAS and potentially documented to increase awareness across the SAS technical community.
That’s it for today!
Thanks for reading!!!
Find more articles from SAS Global Enablement and Learning here.