
Leverage Azure storage to exchange data between Enterprise Guide and SAS Viya (“Dropzone” example)


In this post (half article, half hands-on exercise), we explore and share some findings related to the use of "Azure File Shares" with a SAS Viya deployment.

 

The focus of this post is NOT the use of "Azure File Shares" for the default SAS Viya platform "RWX" Persistent Volumes, but rather their use as a "SASDATA" type storage location (for CAS tables and SAS datasets).

 

We'll see how to create an "Azure File Share" in the Azure Cloud, how to mount it, and how to make it available in Kubernetes for our CAS and Compute server pods.

 

Then we'll show how to position this "Azure File Share" as a common "Drop Zone" for both the SAS Viya components (CAS, Compute server) and Windows clients (such as SAS Enterprise Guide).

 

Finally, we'll have a look at some performance considerations.

 

As it has become quite a long post, I have added a little "ToC":

 

Contents:

  • The use-case
  • Create the Azure File Share
  • Make it available to the SAS Viya Platform (with Azure CSI driver)
  • Make the "Azure File Share" available to the Windows users
  • Performance considerations (read-ahead parameter)
  • Conclusion

I hope you'll enjoy this technical journey… 😊

 

 

The use-case

 

The example that we present here comes from a real-life customer scenario. The users preparing the data for the reports mostly work with SAS Enterprise Guide. They connect to a remote Windows server (Windows Server 2016) where SAS Enterprise Guide is installed and run their ETL workflows from there. The "output" data is then stored on an NFS file share (NAS), which serves as a "Drop Zone" for further processing by the SAS Viya platform.

 

The NFS file share is mounted in the SAS Viya CAS and Compute server pods, so the data can be directly loaded from there into CAS in-memory tables, then saved back to disk, or processed by SAS code (mostly to generate various dashboards and reports).

 

This diagram illustrates the scenario:

01_RP_scenario-diag1-1536x897.png


 

However, the SAS Viya platform is running in Azure, while the Windows server hosting Enterprise Guide and the NFS server are both "on-premises" (the "on-prem" environment is connected to the Azure Cloud through an Azure VPN).

 

This setup leads to performance degradation caused by frequent access from CAS to the data stored on the "on-premises" NFS server. Because the data is located "on-premises" and the SAS Viya run-time is in the public cloud, higher latency is observed on data transfers between the two distant networks.

 

The development team has requested a new storage location in Azure (with a maximum capacity of 200GB, although the expected use should stay below 50GB) to be used as the new "Drop Zone", instead of the "on-premises" NFS server.

 

The goal is to improve performance by avoiding moving data back and forth between the "on-prem" and Azure environments (transfers triggered by frequent CAS data loading and saving operations).

 

The proposed solution is to implement an "Azure File Share" (also accessible from the Enterprise Guide machine) as illustrated below.

02_RP_sceanrio-diag2-1536x878.png

 

While some performance degradation is expected when the ETL workflow running in Enterprise Guide delivers its output data, better response times are expected in the SAS Viya platform, where most of the read/write operations happen.

 

This scenario may look familiar to you 😊: many customers keep and still use legacy "on-prem" environments while deploying SAS Viya in the Cloud.

Also note that the solution implemented here is just one specific way to address the performance issues for a given situation and context…we'll still have a latency problem because of the distant networks (but it will now be more on the "on-premises" activity side rather than in the SAS Viya platform).

 

 

Create the Azure File Share

 

Create the Azure Storage account and the "classic file share"

 

You can create your "Azure File Share" either from the Azure portal or using the Azure CLI. It is quite easy and well documented.

 

Note that you currently have the choice between a "classic file share" and the new "Microsoft.FileShares" resource provider (which does not require creating an Azure Storage account).

 

However, since the new way to create an "Azure File Share" is still in preview, we opted for the "classic file share".

 

  • We just followed the instructions to first create a "provisioned v2" storage account using the Azure CLI with the "PremiumV2_ZRS" SKU.
    • The provisioned v2 model lets us independently provision the storage capacity, IOPS, and throughput.
    • The premium SKU allows us to use the NFS protocol (which is required when using CAS and the SAS Compute server).
resourceGroupName="gelato-rg"
storageAccountName="gelatosassdv2share"
region="eastus"
storageAccountKind="FileStorage"

# Valid SKUs for provisioned v2 file shares are 'PremiumV2_LRS' (SSD Local),
# 'PremiumV2_ZRS' (SSD Zone), 'StandardV2_LRS' (HDD Local),
# 'StandardV2_GRS' (HDD Geo), 'StandardV2_ZRS' (HDD Zone),
# 'StandardV2_GZRS' (HDD GeoZone).
storageAccountSku="PremiumV2_ZRS"
az storage account create --resource-group $resourceGroupName --name $storageAccountName --location $region --kind $storageAccountKind --sku $storageAccountSku --output none
  • Then we followed the next documented set of instructions to create the "Azure File Share" itself:
shareName="reportingdata"
# The provisioned storage size of the share in GiB. Valid range is 32 to
# 262,144.
provisionedStorageGib=200
# If you do not specify ProvisionedBandwidthMibps and ProvisionedIops, the
# deployment uses the recommended provisioning.
provisionedIops=3000
provisionedThroughputMibPerSec=130

az storage share-rm create --resource-group $resourceGroupName --name $shareName --storage-account $storageAccountName --quota $provisionedStorageGib --enabled-protocols NFS

# --provisioned-iops $provisionedIops --provisioned-bandwidth-mibps $provisionedThroughputMibPerSec

 

In this example, we used the recommended provisioning options (they can easily be changed later if needed). Also note that, in the Azure CLI command, we explicitly enabled the NFS protocol (see the --enabled-protocols NFS option at the end of the az storage share-rm command).

 

 

Create the private endpoint

 

After that, we need to create a network endpoint for our "Azure File Share".

When using an NFS file share, networking configuration is required; it is materialized by this page in the Azure portal:

 

03_RP_linux-nfs-share-1536x546.png

 

Either "Service endpoints" or "Private endpoints" can be used.

 

In our case (for "on-premises" access) we'll use a private endpoint, which also gives us a private IP address (that can be used to configure the NFS mounts).

 

As noted in the Azure documentation, "a private endpoint is a network interface that uses a private IP address from your virtual network. This network interface connects you privately and securely to a service that's powered by Azure Private Link. By enabling a private endpoint, you're bringing the service into your virtual network".

 

A DNS record is also required, so an entry in the Private DNS zone is created as part of the private endpoint setup (the associated FQDN is something like <Azure Storage Account Name>.file.core.windows.net).
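
For reference, here is a minimal Azure CLI sketch of that setup (a sketch only: the VNet, subnet, and endpoint names below are hypothetical placeholders, and the same result can be achieved through the portal wizard):

# Create a private endpoint for the "file" sub-resource of the storage account
storageAccountId=$(az storage account show --name $storageAccountName --resource-group $resourceGroupName --query id --output tsv)
az network private-endpoint create --resource-group $resourceGroupName --name "${storageAccountName}-pe" --vnet-name "gelato-vnet" --subnet "private-endpoints" --private-connection-resource-id $storageAccountId --group-id file --connection-name "${storageAccountName}-pe-conn"

# Create the Private DNS zone, link it to the VNet, and register the endpoint in it
az network private-dns zone create --resource-group $resourceGroupName --name "privatelink.file.core.windows.net"
az network private-dns link vnet create --resource-group $resourceGroupName --zone-name "privatelink.file.core.windows.net" --name "${storageAccountName}-dns-link" --virtual-network "gelato-vnet" --registration-enabled false
az network private-endpoint dns-zone-group create --resource-group $resourceGroupName --endpoint-name "${storageAccountName}-pe" --name "default" --private-dns-zone "privatelink.file.core.windows.net" --zone-name "storageFilePrivateDnsZone"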

 

 

Disable the "Secure transfer required" option

 

One important step is to disable the "Secure transfer required" option on the Azure storage account.

This setting enforces encryption in transit by requiring SMB 3.x with encryption or HTTPS, but it does not allow NFS mounts.

 

If you try to mount the "Azure File Share" via NFS with this option enabled, you will see a "mount.nfs: access denied by server" error message.

 

04_RP_disable-secure-transfer-1024x420.png

 

So, we need to disable the option if we want to mount an "Azure File Share" in our CAS and Compute server pods using the standard NFS protocol.
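
If you prefer scripting over the portal, this comes down to a single Azure CLI call (reusing the variables defined earlier):

# Disable "Secure transfer required" so that NFS mounts are allowed
az storage account update --name $storageAccountName --resource-group $resourceGroupName --https-only false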

 

 

Configure NFS root squash

 

Finally, for security purposes, it is also strongly recommended to set "root squash" on the NFS file share. It prevents any NFS client from claiming the "root" (superuser) account when accessing the files.
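
This can be configured from the portal page shown below or, as a sketch, with the Azure CLI (the --root-squash parameter accepts NoRootSquash, RootSquash, or AllSquash):

az storage share-rm update --resource-group $resourceGroupName --storage-account $storageAccountName --name $shareName --root-squash RootSquash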

 

05_RP_root-squash-config-1024x552.png

 

 

Validate the Azure File share

 

A good way to make sure that everything is in place is to mount the newly created "Azure File Share" with the NFS protocol from a VM in the same VNet. Since we created our Azure environment with the IaC for Azure project, a jumphost VM had already been provisioned in the same resource group as our AKS cluster (where SAS Viya is running).

 

Here is an example:

 

sudo mkdir -p /mount/gelatosassdv2share/reportingdata
sudo mount -t nfs gelatosassdv2share.file.core.windows.net:/gelatosassdv2share/reportingdata /mount/gelatosassdv2share/reportingdata -o vers=4,minorversion=1,sec=sys,nconnect=4

 

If the mount is successful, nothing is returned by the command. If you then run "df -h", you should see something like this:

 

06_RP_azure-file-share-mount-1024x292.png

 

Great news! It confirms that our "Azure File Share" was successfully mounted with the NFS protocol!

 

 

Make it available to the SAS Viya Platform (with Azure CSI driver)

 

Now, we need to configure our SAS Viya environment so that the 50GB of storage required for our "Drop Zone" can be made available to the CAS and Compute server pods.

 

 

Use static PV provisioning to apply NFS mount options

 

Azure recommends specific values for the NFS mount options, such as nconnect, rsize, and wsize for better performance, and noresvport for reliability.

 

Usually, NFS shares are made available to CAS and Compute by using the SAS Viya-provided PatchTransformer examples (available in the README files). They are intended to apply the volumes and volumeMounts changes directly at the pod level (to reference the new NFS volumes).

 

07_RP_patchTransformer-cas-example-1024x683.png

 

However, we cannot add any mount options at the pod level. In order to set custom mount options (such as nconnect or noresvport), you must move away from the "direct" pod volume definition and use PersistentVolume (PV) and PersistentVolumeClaim (PVC) definitions instead. The PV spec does support the mountOptions field (which is not the case for the pod spec).

 

Here is how we created the PersistentVolume (PV) in our example:

 

08_RP_azure-nfs-pv-1024x865.png

 

As you can see, we are also using the Azure-provided CSI driver for the "Azure File Share" (it is fine to use the CSI driver since we are working with Premium storage and the NFS protocol).

 

We also set the fsGroupChangePolicy to "none" to avoid recursive ownership changes on the PV's underlying data when the mounting pod's fsGroup differs. This is a SAS recommendation.

 

Finally, note that the "reclaim policy" is set to "Retain", which means that even if we delete the PV object, the underlying physical volume and the data are kept (we want that in case we have to extend the size of the volume in the future).
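
To give a concrete idea of the manifest shown in the screenshot, here is a minimal sketch of such a static PV (the names, capacity, and mount options approximate our setup and must be adapted; this is not the exact manifest we deployed):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-azure-reportingdata
spec:
  capacity:
    storage: 150Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain   # keep the underlying data if the PV object is deleted
  storageClassName: ""
  mountOptions:
    - nconnect=4       # several TCP connections for better throughput
    - noresvport       # recommended by Azure for reliability on reconnects
  csi:
    driver: file.csi.azure.com
    volumeHandle: gelatosassdv2share_reportingdata   # must be unique in the cluster
    volumeAttributes:
      resourceGroup: gelato-rg
      storageAccount: gelatosassdv2share
      shareName: reportingdata
      protocol: nfs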

 

We then create the associated Persistent Volume Claim (PVC):

 

09_RP_azure-nfs-pvc-1024x567.png

 

We reference the previously created PV and set the storageClassName to an empty value (a way to tell Kubernetes: "don't dynamically provision this PVC from a StorageClass; bind it to the static PV instead").

 

You may have noticed that we set the storage capacity of the PV and PVC to 150Gi (and not directly to 200Gi). It is a way to implement a first threshold in case storage space utilization exceeds what is expected 😊
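
Again as a sketch (the namespace below is a placeholder for your SAS Viya namespace), the matching PVC could look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-azure-reportingdata
  namespace: viya
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""                  # empty value: do not provision dynamically
  volumeName: pv-azure-reportingdata    # bind explicitly to our static PV
  resources:
    requests:
      storage: 150Gi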

 

Finally, we can create the PatchTransformer that will be applied to the CASDeployment object:

 

10_RP_PatchTransformer-for-CAS-with-PVC-1024x570.png

 

and to the "SAS Launcher" pod templates:

 

11_RP_PatchTransformer-for-SASLauncher-with-PVC-1024x487.png

 

We simply reference our PVC and specify the path under which we want the pod to see our "Azure File Share" files ("/nfs/azure/reportingdata").
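
To illustrate, a PatchTransformer for CAS along the lines of the SAS-provided examples could look like the following sketch (the transformer and volume names are hypothetical; the SAS Launcher pod templates are patched the same way, with a different target):

apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-add-azure-reportingdata
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/volumes/-
    value:
      name: azure-reportingdata
      persistentVolumeClaim:
        claimName: pvc-azure-reportingdata
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/volumeMounts/-
    value:
      name: azure-reportingdata
      mountPath: /nfs/azure/reportingdata
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1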

 

 

Add missing permissions

 

The first time we applied the configuration changes, redeployed, and tried to start a new Compute server pod, it failed to launch…there was a "FailedMount" error message in the SAS Launcher log ☹️

 

12_RP_failedNFSMountError-1024x334.png

 

It appeared that some Azure permissions needed to be changed at the storage account level.

 

The following role was given to the AKS cluster managed identity (gelato-aks-agentpool) at the storage account level (scope):

13_RP_AKS-agentpool-storage-role-1024x430.png

 

To do it, select the storage account object, go to "Access Control (IAM)", then create a new "role assignment" that associates the "Storage Account Key Operator Service Role" role with the gelato-aks-agentpool member.
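
The CLI equivalent could look like this sketch (note: the gelato-aks-agentpool identity typically lives in the AKS node resource group, so adjust the --resource-group placeholder accordingly):

# Look up the managed identity's principal ID
principalId=$(az identity show --resource-group <aks-node-resource-group> --name "gelato-aks-agentpool" --query principalId --output tsv)

# Assign the role at the storage account scope
storageAccountId=$(az storage account show --name $storageAccountName --resource-group $resourceGroupName --query id --output tsv)
az role assignment create --assignee $principalId --role "Storage Account Key Operator Service Role" --scope $storageAccountId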

 

Note that you may need to bounce the csi-azurefile-node-XXX pod on the compute server node, so that the permission is refreshed.
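
For example (a sketch; the actual pod name suffix will differ in your cluster):

# Find the csi-azurefile-node pod running on the compute node
kubectl get pods -n kube-system -o wide | grep csi-azurefile-node

# Delete it; the DaemonSet recreates it immediately with refreshed credentials
kubectl delete pod -n kube-system csi-azurefile-node-XXX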

 

After these changes and a new attempt in SAS Studio, the new NFS-mounted directory finally appeared:

 

14_RP_SAS-Studio-NFS-Shared-Files-1024x811.png

 

A restart of CAS did not cause any issue, and when executing the "df -h" command directly in the CAS controller pod, we could see the new volume that corresponds to our "Azure File Share":

 

15_RP_df-h-output-1024x306.png

 

OK, that's all good: we can now use this new storage space, made available through the "Azure File Share", for our SAS Compute and CAS servers!

 

However, we are only halfway there… remember that the customer asked for the "Drop Zone" to also be accessible to the Enterprise Guide users running their ETL flows from the Windows server. We now need to configure that too!

 

 

Make the "Azure File Share" available to the Windows users

 

Originally, the idea was to use the NFS protocol for access from both the SAS Viya platform and our Windows server (where SAS Enterprise Guide is running).

 

 

A Technical "deadlock"

 

However, while trying to mount the "Azure File Share" over NFS on our Windows server, we discovered, the hard way, that we had a compatibility issue there…☹️

 

While Windows Server 2016 does have a "Client for NFS" feature, Azure Files NFS shares require NFS v4.1, and the NFS client in Windows Server 2016 only supports NFS v2 and v3…it cannot speak NFS v4.1.

 

As an alternative, Windows Server can use the SMB protocol to mount the "Azure File Share". However, our file share has already been configured for the NFS protocol, and (unlike NetApp) the "Azure File Share" service does not support a "dual protocol" feature; it is either NFS or SMB.

 

At this point, there are basically three architectural options to solve this problem…

 

Option 1: Recreate the file share as an SMB share.

 

  • Windows Server 2016 can connect natively through SMB 3.0
  • Kubernetes can mount SMB shares using the file.csi.azure.com driver

 

Option 2: Use a small Linux VM or a specialized Pod to act as a bridge (The "Glue" approach)

 

  • Mount the NFS share to a lightweight Linux VM (or a container).
  • Re-export that mount point from the Linux VM as an SMB/Samba share.
  • Windows Server 2016 mounts the Samba share from the Linux VM.

 

Option 3: Azure Storage Mover / Sync (The "Staging" approach)

 

  • Instead of a live "Drop Zone" on the same share, use two separate shares and sync them
    • Share A (SMB): Used by Windows Server users.
    • Share B (NFS): Used by Kubernetes pods.
  • Use Azure Storage Mover or a simple rsync cron job on a Linux node to move files from the SMB share to the NFS share every minute.

 

Option 1 seems to be the best choice…unfortunately, in order to meet the CAS and Compute requirements, we must use NFS (even if Kubernetes could mount the volumes via SMB on Linux nodes, the SAS Viya documentation clearly states that the "Azure Files CSI driver is supported only with the NFS protocol"…a POSIX-compliant file system is required for the SAS Compute and CAS servers).

 

 

Implementing the "Glue" approach

 

Since we already have a small Linux jumphost VM available in our Azure resource group, we decided to try option 2 and basically:

 

      • mount the "Azure File Share" (through NFS) on the jumphost
      • then expose it via SMB to the on-premises Windows 2016 server.

 

Here is how we implemented this workaround (note: the screenshots come from our internal setup instructions; the steps are provided here as an example and should be adjusted to match a different implementation):

 

  1. Make the "Azure File Share" mount persistent on the jumphost (see the fstab sketch after this list).
    16_RP_make-azure-nfs-share-mount-persistent-1024x239.png

 

  2. Install and configure a "samba" server on the jumphost (Samba is the open-source implementation of the SMB protocol on Linux).
    • Install Samba

      17_RP_install-samba-1024x94.png

    • Configure and add a "samba share" for the "Azure File Share" (see the smb.conf sketch after this list)
      18_RP_smbconf-newshare-1024x415.png
    • Restart the SMB service
      19_RP_restart-smb-1024x188.png

 

  3. Configure the access permissions.
    • The goal is to set up folders and permissions so that we have a common "Drop Zone" folder in which files can be uploaded from the Windows server and created by SAS Viya users.
    • We use the samba "force user" and "force group" settings so that files/folders uploaded from the Windows server machine are owned by glsuser1 (instead of "nobody").
    • The gelrace group name and GID correspond to an Active Directory group whose members work on the data from SAS Viya (we assume that SAS Viya is connected to the same AD and that the group is mapped to a custom group in the SAS Viya identities configuration).
    • Extract from /etc/samba/smb.conf:

20_RP_smb-shares-access-1024x413.png

 

    • Note that the glsuser1 user and gelrace group must be known locally on the jumphost VM (you can create local users and groups if that is not the case).

 

  4. We then created a "dropzone" folder as glsuser1 and changed the permissions to make it "group writable".
    21_RP_change-perms-on-dropzone-1024x147.png
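
As promised in the steps above, here are minimal sketches of the two configuration pieces (assumptions based on our setup; paths, user, and group names must be adapted). First, an /etc/fstab entry to make the NFS mount persistent on the jumphost:

gelatosassdv2share.file.core.windows.net:/gelatosassdv2share/reportingdata /mount/gelatosassdv2share/reportingdata nfs vers=4,minorversion=1,sec=sys,nconnect=4 0 0

And a possible /etc/samba/smb.conf share definition along the lines of the extract shown above:

[reportingdata]
   path = /mount/gelatosassdv2share/reportingdata
   browseable = yes
   read only = no
   # files created from Windows are owned by the generic service account / AD group
   force user = glsuser1
   force group = gelrace
   create mask = 0664
   directory mask = 0775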

 

After restarting our samba server, we can now mount the SMB share on our Windows server using a UNC path and map it to a drive letter (like Z:).
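
From a command prompt on the Windows server, that could look like this sketch (the jumphost hostname is a placeholder; the * makes Windows prompt for the password):

net use Z: \\<jumphost-hostname-or-ip>\reportingdata * /user:glsuser1 /persistent:yes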

 

At the end, our system looks like this:

22_RP_sceanrio-diag3-1024x588.png

 

When files are copied into our "Azure File Share" by the analysts working with Enterprise Guide on the remote Windows 2016 server, they belong to the glsuser1 generic service account.

 

Performance considerations (read-ahead parameter)

 

Before concluding this post, I wanted to come back to a specific performance consideration. We already mentioned that some specific options are strongly recommended by Azure to get the best performance when working with NFS-mounted "Azure File Shares" (nconnect, rsize, wsize, etc.).

 

However, the post written by Abhilash and Hans-Joachim (in 2024) really stressed the importance of the read-ahead parameter and explained how tuning this parameter can significantly impact performance when reading and writing files on the NFS share.

 

Several tests were performed (with DATA steps, PROC FREQ, and CAS table loading) to show how response times can be improved when the read-ahead parameter is set to the correct value; the results are shown in the tables below.

 

Here is an extract of the post:

 

T1_RP_read-ahead-perfcomp1-1024x444.png

 

And:

 

T2_RP_read-ahead-perfcomp2-1024x410.png

 

Looking at these numbers, it clearly appears that setting the read-ahead parameter to the correct value can drastically improve performance!

 

But as noted in the post, “Unlike other mount options - which we will discuss as well further down -, the “read-ahead” parameter has to be increased on the OS level and not at the file share level, which makes configuring this setting a bit more complicated in the Kubernetes world.”

 

Reading the post convinced me that, while quite complicated to implement, I really had to go the extra mile and implement this specific read-ahead tuning in our environment to get the best performance from our "Azure File Share".

 

However, before implementing it, I looked for a way to get the current value of the read-ahead parameter (so I could check it before and after the configuration change).

 

Since it is an operating-system-level parameter, we need to run commands at the AKS node level. An easy way to do that is to use the kubectl debug command.
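
For example (a sketch using the node-debugging image documented by Azure; replace the node name with one of your AKS nodes):

# Start an interactive debug pod on the node
kubectl debug node/aks-compute-12345678-vmss000000 -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0

# Inside the debug pod, switch to the node's root filesystem
chroot /host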

 

Azure provides a little script to show/set the read-ahead value, and my plan was to write the script and run it on one of the AKS nodes to collect the current value. But in order to write a script, I first had to install a text editor:

 

23_RP_connect-to-compute-node-1024x202.png

 

I'm copying my own version of the "Azure-provided" script here, as I had to tweak it a little bit to make it work 😊

 

24_RP_script-to-get-readahead-value-1024x708.png
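
If you cannot zoom into the screenshot, here is a minimal sketch of a script along the same lines (the grep pattern is an assumption based on our share name and must be adjusted to your mount):

#!/bin/bash
# Find the NFS mount point of the Azure File Share on this node
MOUNT_POINT=$(mount | grep nfs | grep reportingdata | awk '{print $3}' | head -1)
if [ -z "$MOUNT_POINT" ]; then
  echo "No matching NFS mount found (is a compute pod currently running?)"
  exit 1
fi

# mountpoint -d returns the backing device ID (major:minor), which indexes
# the corresponding entry under /sys/class/bdi
BDI=$(mountpoint -d "$MOUNT_POINT")
echo "Current read-ahead for $MOUNT_POINT: $(cat /sys/class/bdi/$BDI/read_ahead_kb) KiB"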

 

After creating the script with vi and running it, I got this result:

 

25_RP_read-ahead-value-1024x101.png

 

After checking the Azure documentation, it turned out that this was exactly the required value (matching our mounted file system's rsize value)!

 

That's how I learned that Microsoft had already updated the default value of the read-ahead parameter in their AKS VM images.

 

So there were no further changes to make!

 

Note: an important condition for a successful execution of the script is that a compute pod (where the mount has been done) must be running when you run the script (you can open SAS Studio to trigger the startup of a Compute server pod).

 

 

Conclusion

 

If you have read everything up to this point…congratulations!

 

It is a rather long and technical post with many screenshots and very detailed configurations…I favored the raw material over a polished form, so I could really share the experience as I went through it and documented it.

 

While it required some effort and several attempts, we can see here that it is possible to achieve a sophisticated setup that meets both customer requirements and technology constraints. The configuration of this (GEL internal) environment brought several findings and helped us better understand some aspects of integrating the SAS Viya platform with Azure.

 

Hopefully you'll find this article useful and maybe even learn a few things (as I sure did) 😊

 

End result

The files and datasets generated by SAS Enterprise Guide users in their Windows environment are copied into the "Drop Zone" and appear in the SAS Studio view for further processing with the CAS and SAS Compute servers running in the SAS Viya platform.

 

26_RP_final-view-with-dropzone-906x1024.png

 

 


Find more articles from SAS Global Enablement and Learning here.
