Last Updated: 10DEC2020
When considering deploying SAS 9.4 (SAS GRID or SAS Analytics Pro) in the Microsoft (MS) Azure Cloud, Azure NetApp Files (ANF) is a viable primary storage option for SAS GRID clusters of limited size. Given the SAS recommendation of 100MiB/s of throughput per physical core, SAS GRID clusters using an ANF volume for SASDATA (persistent SAS data files) scale to 32 physical cores across two or more MS Azure machine instances. Cluster sizes are predicated upon the architectural constraint of a single SASDATA namespace per SAS cluster and the available bandwidth of a single Azure NetApp Files volume. The core count guidance will be revisited as Azure infrastructure (compute, network, and per-file-system storage bandwidth) increases over time.
Testing was completed using Azure NetApp Files volumes accessed via NFSv3. SAS does not currently recommend NFSv4.1 for SAS Grid deployments.
The intent of this paper is to provide sizing guidance and to set proper expectations for a successful deployment of Azure NetApp Files as the storage behind SAS 9.4.
Azure NetApp Files Per Volume Expectations:
It has been tested and documented that a single Azure NetApp Files volume can deliver up to 4,500MiB/s of reads and 1,500MiB/s of writes. Given an Azure instance type with sufficient egress bandwidth, a single virtual machine can consume all of the write bandwidth of a single Azure NetApp Files volume. With that said, no single virtual machine, regardless of SKU, can consume all of the read bandwidth of a single volume.
The main shared workload of SAS 9.4 – SASDATA – has an 80:20 read:write ratio and as such the important per volume numbers to know are:
80:20 workload with 64K Read/Write: 2,400MiB/s of read throughput and 600MiB/s of write throughput running concurrently (~3,000MiB/s combined)
The throughput numbers quoted above can be seen in the aforementioned documentation under NFS scale out workloads – 80:20.
Please note, SASWORK (temporary SAS data files), which has a 50:50 read:write ratio, should not be placed on Azure NetApp Files volumes at this time.
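To put the SASDATA per-volume numbers above against the 100MiB/s per physical core target, a rough back-of-the-envelope check (not a hard limit) looks like this:
32 physical cores x 100MiB/s per core = 3,200MiB/s required
Single volume at 80:20 = ~2,400MiB/s reads + ~600MiB/s writes = ~3,000MiB/s available
This is why SAS GRID clusters backed by a single SASDATA volume top out at roughly 32 physical cores.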
Which Version of RHEL Should Be Used?
As SAS stated in the Best Practices for Using MS Azure with SAS paper, the E64-16ds_v4 and E64-32ds_v4 MS Azure instances are recommended for SAS 9, providing the best overall SAS experience. Based on this, the following high-level performance guidance relevant to Azure NetApp Files is provided:
SAS/ANF sizing guidance:
If using a RHEL7 operating system, the E64-16ds_v4 is the best choice based upon the 100MiB/s per physical core target for SASDATA.
E64-16ds_v4 – 90 –100MiB/s per core
E64-32ds_v4 – 45-50MiB/s per core
If using RHEL8.2, either the E64-16ds_v4 or the E64-32ds_v4 is viable, though the former is preferable given the 100MiB/s per core target for SASDATA.
E64-16ds_v4 – 150-160 MiB/s per core
E64-32ds_v4 – 75-80 MiB/s per core
If using RHEL8.3, both the E64-16ds_v4 and the E64-32ds_v4 are likely fully acceptable given the per-core throughput target (pending SAS internal validation performance runs to fully vet this number).
Early validation indicates approximately 3,000MiB/s of reads
Results will be posted once validation is complete
The guidance above is based on the following:
Red Hat Enterprise Linux (RHEL) is the distribution of choice for SAS customers running SAS 9 on Linux. Each kernel supported by Red Hat has its own bandwidth constraints when using NFS.
Testing has shown that a single RHEL 7 instance is expected to achieve no more than roughly 750-800MiB/s of read throughput against a single storage endpoint (that is, a single network socket), while 1,500MiB/s of writes is achievable against the same, using 64KiB rsize and wsize mount options. There is evidence that this read throughput ceiling is an artifact of the 3.10 kernel; refer to RHEL CVE-2019-11477 for detail.
Testing has shown that a single RHEL 8.2 instance with its 4.18 kernel is free of the limitation found in the 3.10 kernel above; as such, 1,200-1,300MiB/s of read traffic is achievable using 64KiB rsize and wsize mount options. With that said, expect the same 1,500MiB/s of achievable throughput for large sequential writes as seen in RHEL7.
A single RHEL 8.3 instance has not yet been tested by SAS for SAS 9.4 workloads. With that said, using the nconnect mount option new to the RHEL8.3 distribution, roughly 3,000MiB/s of read throughput from a single Azure NetApp Files volume is likely achievable. Expect no more than 1,500MiB/s of writes to a single volume even with nconnect.
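For reference, the per-core figures quoted in the sizing guidance above line up with these single-instance read ceilings divided by the physical core count of each constrained-core SKU (assuming 2 vCPUs per physical core on the Edsv4 family, i.e. 8 physical cores for the E64-16ds_v4 and 16 for the E64-32ds_v4):
RHEL7:   750-800MiB/s     / 8 cores  (E64-16ds_v4)  = ~90-100MiB/s per core
RHEL7:   750-800MiB/s     / 16 cores (E64-32ds_v4)  = ~45-50MiB/s per core
RHEL8.2: 1,200-1,300MiB/s / 8 cores  (E64-16ds_v4)  = ~150-160MiB/s per core
RHEL8.2: 1,200-1,300MiB/s / 16 cores (E64-32ds_v4)  = ~75-80MiB/s per core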
File System Mount Options
SAS recommends the following NFS mount options for NFS shared file systems being used for permanent SAS data files:
bg,rw,hard,rsize=65536,wsize=65536,vers=3,nolock,noatime,nodiratime,rdirplus,tcp
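As an illustration only, a mount of a SASDATA volume using these options might look like the following; the IP address and export path are placeholders for your own Azure NetApp Files mount target and volume:

sudo mkdir -p /sasdata
sudo mount -t nfs -o bg,rw,hard,rsize=65536,wsize=65536,vers=3,nolock,noatime,nodiratime,rdirplus,tcp 10.0.0.4:/sasdata /sasdata

The equivalent /etc/fstab entry would be:

10.0.0.4:/sasdata  /sasdata  nfs  bg,rw,hard,rsize=65536,wsize=65536,vers=3,nolock,noatime,nodiratime,rdirplus,tcp  0 0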
Capacity Recommendations
Please see the following Azure NetApp Files performance calculator for guidance when sizing SASDATA volumes. Because volume bandwidth is based upon volume capacity, capacity cost is based upon the selected service level, and service level selection is based upon capacity versus bandwidth needs, determining the appropriate service level on your own can be somewhat complicated. Using this calculator, enter data as follows (a worked example appears after the list):
Volume Size: <Desired Capacity>
I/O Size: 64KiB Sequential
Read Percentage: 80%
Throughput: <Desired Throughput considering 100MiB/s per core>
IOPS: 0
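As a worked example (values are illustrative only), a 16-physical-core grid built from two E64-16ds_v4 instances and targeting 100MiB/s per core would be entered as:

Volume Size: 10TiB (whatever capacity you actually require)
I/O Size: 64KiB Sequential
Read Percentage: 80%
Throughput: 1,600MiB/s (16 physical cores x 100MiB/s)
IOPS: 0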
The readout at the bottom of the screen advises capacity requirements at each service level and the cost per month thereof.
Throughput: This is the bandwidth of the volume based on the workload mixture. For an 80% 64KiB sequential read workload, 3096MiB/s is the anticipated maximum.
IOPS: This is the number of IOPS the volume will deliver at the above throughput target.
Capacity Pool Size: A volume's capacity is carved from a capacity pool. Capacity pools are sized in 1TiB increments.
Volume Size: This is the amount of capacity needed by the volume at a given service level to achieve the required throughput. Volume capacity (reported in GiB) may be equal to or less than the capacity pool size.
Capacity Pool Cost (USD/Month): This is the cost per month of the capacity pool at the given size.
Volume Show Back (USD/Month): This is the cost per month of the capacity for the volume at the specified capacity. Charges are based on capacity pool size; the volume show back shows the portion of that cost attributable to the volume.
Note: the user experience will be the same regardless of which service level is selected.
Costs can be controlled further using the concept of volume shaping with Azure NetApp Files. Two dynamic options are available to customers to influence performance and cost (a CLI sketch follows the list):
Dynamically resize a volume and capacity pool
Dynamically change the service level of a volume
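Both operations can be scripted. The sketch below uses the Azure CLI with placeholder resource names; verify the exact parameter names and units against the current az netappfiles documentation before use:

# Grow (or shrink) a capacity pool, sized in TiB
az netappfiles pool update -g sas-rg --account-name sas-anf --name pool1 --size 8
# Resize a volume quota (GiB) within that pool
az netappfiles volume update -g sas-rg --account-name sas-anf --pool-name pool1 --name sasdata --usage-threshold 6144
# Change a volume's service level by moving it to a pool of a different service level
az netappfiles volume pool-change -g sas-rg --account-name sas-anf --pool-name pool1 --name sasdata --new-pool-resource-id <resource ID of the target pool>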
Other Tuning Guidance
Mellanox Driver Tuning:
Mellanox drivers are used for accelerated networking on Azure virtual machines. Mellanox recommends pinning network interfaces to the lowest-numbered NUMA node for the best possible user experience. In general, NICs should be pinned to NUMA node 0; in Azure, however, hypervisor logic means that may not be the ideal configuration. The Eds_v4 SKUs have been found to be only nominally susceptible to this issue, but benefit may still be found by pinning the accelerated networking interface to the most appropriate NUMA node.
In addition to remapping the accelerated networking interface, additional benefit may be found in setting the number of tx/rx queues for the accelerated networking interface to no more than the number of logical cores associated with a single NUMA node. By setting the tx/rx queue count equal to or less than the core count per NUMA node, you avoid queue selection wherein the queue is not resident to the most appropriate NUMA node. On some systems, the maximum tx/rx queue count is less than the core count for a single NUMA node – only on these systems should the tx/rx queue count be set less than the logical core count per NUMA node.
Please download the following code written by Azure engineering to correctly identify and set the accelerated networking interface and tx/rx queue count on each worker node. This script should be executed at VM startup. All that is required is that at least one NFS mount is already in place.
The script is run as such: ./set_best_affinity.ksh </nfsmount/file>
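If you want to inspect the relevant settings manually before or after running the script, the following commands are one way to do so (eth1 is a placeholder for the accelerated networking interface):

# NUMA node the NIC is attached to (-1 means no affinity is reported)
cat /sys/class/net/eth1/device/numa_node
# Logical core layout per NUMA node
lscpu | grep -i numa
# Current and maximum combined tx/rx queue counts
ethtool -l eth1
# Set the combined queue count to no more than the logical cores of one NUMA node (8 shown as an example)
sudo ethtool -L eth1 combined 8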
RPC Slot Table Tuning
RPC slot tables define the maximum number of concurrent requests the NFS client may have in flight on a single TCP connection to the server. These values are controlled through sunrpc configuration on NFS clients. The latest NFS client versions default to a dynamic slot table value of 65536, which means the client attempts to use as many slots as it can on a single TCP connection. Azure NetApp Files, however, supports 128 slot tables per TCP connection. If a client exceeds that value, Azure NetApp Files enacts NAS flow control and pauses client operations until resources are freed up. As a best practice, set the slot table values on NFS clients to a static value no higher than 128. To ensure the best possible SAS storage performance with the NFSv3 protocol, add the following tunables to /etc/sysctl.conf and then apply them by running sysctl -p:
sunrpc.tcp_max_slot_table_entries=128
sunrpc.tcp_slot_table_entries=128
Run the above before mounting NFS volumes.
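For example, one way to persist and verify these settings (assuming root privileges and that appending to /etc/sysctl.conf is acceptable in your environment):

echo "sunrpc.tcp_max_slot_table_entries=128" | sudo tee -a /etc/sysctl.conf
echo "sunrpc.tcp_slot_table_entries=128" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
sysctl sunrpc.tcp_max_slot_table_entries sunrpc.tcp_slot_table_entries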
NOTE: The slot table values used above may not be optimal for the RHEL8.3 clients when using the nconnect NFS mount option. More information on this tuning parameter for RHEL8.3 will come later.
Machine Placement:
Use Proximity Placement Groups to co-locate your Azure instances within the same data center and under a common router to:
Reduce intra-SAS node network latency
Provide a similar network latency between each SAS node and Azure NetApp Files
NOTE: At this time, Proximity Placement Groups have no bearing on the location of Azure instances relative to Azure NetApp Files storage; updates will be made should this change.
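As a sketch only (resource names, region, and image are placeholders), a proximity placement group can be created and referenced when deploying each SAS grid node with the Azure CLI:

az ppg create --name sas-ppg --resource-group sas-rg --location eastus
az vm create --resource-group sas-rg --name sas-node1 --size Standard_E64-16ds_v4 --image RHEL --ppg sas-ppg --accelerated-networking true
# repeat az vm create for each additional grid node, referencing the same --ppg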