A more updated version of this paper can be found here: https://communities.sas.com/t5/Administration-and-Deployment/Best-Practices-for-Using-Microsoft-Azure-with-SAS/m-p/676833#M19680
If you are considering moving your SAS applications to MS Azure, please review the information below to make sure you are choosing the correct MS Azure instances and have prepared them as optimally as possible.
As always, a detailed understanding of how SAS is being used, where the source data resides, what time constraints apply to running reports/jobs, etc., will help with the decisions you make in choosing MS Azure instances and storage.
Let’s start off with the MS Azure instance types. This link brings you to the list of instance types.
We have compiled the following information to help guide your decisions. The memory optimized instances, Ev3 and Esv3 series in particular, tend to be the best for SAS. Let’s walk through important information found on the MS Azure Instance types link above.
The Ev3 and Esv3 series VMs can come configured with either Broadwell or Skylake processors. From the Portal, you cannot select which chipset will be used for the VM. After an instance is created, the lscpu command will list the CPU Model Name for the system. For your SAS Grid compute nodes and CAS Controller/Workers, we recommend that all the systems use the same CPU model. Please work with MS Azure to determine how to make this happen.
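As a rough illustration of the CPU model check, the following Python sketch extracts the "Model name" field from lscpu output collected from each node and reports any node that differs from the majority. The node names and the sample lscpu text are hypothetical; how the output is gathered from each node (ssh, a configuration management tool, and so on) is left open.

```python
import re
from collections import Counter


def cpu_model(lscpu_output):
    """Return the 'Model name' value from lscpu output, or '' if it is absent."""
    match = re.search(r"^Model name:[ \t]*(.+)$", lscpu_output, flags=re.MULTILINE)
    return match.group(1).strip() if match else ""


def mismatched_nodes(models_by_node):
    """Return the nodes whose CPU model differs from the most common model."""
    if not models_by_node:
        return {}
    majority, _ = Counter(models_by_node.values()).most_common(1)[0]
    return {node: model for node, model in models_by_node.items() if model != majority}


# Hypothetical lscpu output captured from three compute nodes (illustrative only).
sample_outputs = {
    "compute01": "Architecture: x86_64\nModel name:   Intel(R) Xeon(R) Platinum 8171M CPU @ 2.10GHz\n",
    "compute02": "Architecture: x86_64\nModel name:   Intel(R) Xeon(R) Platinum 8171M CPU @ 2.10GHz\n",
    "compute03": "Architecture: x86_64\nModel name:   Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz\n",
}

models = {node: cpu_model(text) for node, text in sample_outputs.items()}
print(mismatched_nodes(models))
```

With the sample data above, only compute03 is reported, because it is the one node running a different CPU model.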
Review the “Max uncached disk throughput IOPS/MBps” column to see the maximum IO throughput, in MB per second, available between the instance you are considering and Premium Storage. For a Standard_E32s_v3 instance (one of the most popular MS Azure instances used for SAS compute systems), the maximum IO throughput (instance total, not per physical core) is 768 MB per second. For a 16 physical core system, this means 48 MB/sec/physical core of IO bandwidth for all the data that will be stored on external Premium Storage.
Review the “Max NICs/Expected network bandwidth (Mbps)” column to see the maximum network bandwidth. For a Standard_E32s_v3 instance, the maximum network bandwidth is 16 Gigabit per second. Please note SAS recommends a network bandwidth of at least 10 Gigabit per second between the SAS systems that make up your SAS infrastructure.
Review the “Temp storage (SSD) GiB” and “Max cached and temp storage throughput: IOPS/MBps (cache size in GiB)” columns to see the size and maximum IO throughput of the local, ephemeral disk. For a Standard_E32s_v3 instance, the maximum size of the internal SSD that could be used for temporary SAS file systems (SAS WORK/UTILLOC or CAS_DISK_CACHE) is 512 GB and the maximum IO throughput is 512 MB/sec (32 MB/sec/physical core). This is not a large amount of capacity, and the IO bandwidth is lower than SAS recommends, so you will probably not want to use it for temporary SAS file systems. This shifts more IO pressure to the external Premium Storage, which also has a cap on its IO throughput (the “Max uncached disk throughput IOPS/MBps” value discussed above). Note that local, ephemeral storage must be used as separate disks and cannot be striped together.
Please note you can use MS Azure’s constrained cores feature to reduce the number of vCPUs presented to the OS of an instance. This would turn the above Standard_E32s_v3 from a 16 physical core system into an 8 physical core system, effectively doubling the IO bandwidth per core listed above. This will, in turn, bring the IO throughput per core closer to the minimum recommended for SAS workloads. Details on this feature can be found here.
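The per-core arithmetic above can be reproduced with a few lines of Python. This is only a sketch using the Standard_E32s_v3 figures quoted in this post (768 MB/sec to Premium Storage, 512 MB/sec to the local SSD, 16 physical cores, or 8 with constrained cores); the function and constant names are illustrative.

```python
def mb_per_sec_per_core(instance_mb_per_sec, physical_cores):
    """Divide an instance-level IO throughput cap evenly across physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return instance_mb_per_sec / physical_cores


# Standard_E32s_v3 figures as quoted in the text above.
PREMIUM_STORAGE_MBPS = 768   # max uncached disk throughput (instance total)
TEMP_STORAGE_MBPS = 512      # max cached and temp storage throughput (instance total)
FULL_CORES = 16              # physical cores with all vCPUs presented
CONSTRAINED_CORES = 8        # physical cores when the core count is constrained by half

print(mb_per_sec_per_core(PREMIUM_STORAGE_MBPS, FULL_CORES))         # 48.0
print(mb_per_sec_per_core(TEMP_STORAGE_MBPS, FULL_CORES))            # 32.0
print(mb_per_sec_per_core(PREMIUM_STORAGE_MBPS, CONSTRAINED_CORES))  # 96.0
print(mb_per_sec_per_core(TEMP_STORAGE_MBPS, CONSTRAINED_CORES))     # 64.0
```

The instance-level caps do not change when cores are constrained; only the divisor does, which is why halving the core count doubles the per-core figure.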
Let’s talk about setting up the network.
To achieve optimal network bandwidth, Accelerated Networking must be enabled. Accelerated Networking requires RHEL 7.4 or higher.
To validate that you have Accelerated Networking enabled, please run the following commands and ensure your output looks like the output on this web site.
lspci
ethtool -S eth0 | grep vf_
uname -r
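If you want to check the ethtool output programmatically rather than by eye, a minimal Python sketch might look like the following. It assumes the vf_ counters appear as "name: value" lines, and it treats non-zero counters as a sign that traffic is flowing through the virtual function. That is a heuristic, not an official test, and the sample text is illustrative rather than captured from a real system.

```python
def vf_counters(ethtool_stats):
    """Collect the vf_* counters from the text printed by `ethtool -S <interface>`."""
    counters = {}
    for line in ethtool_stats.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        if sep and name.startswith("vf_") and value.isdigit():
            counters[name] = int(value)
    return counters


def looks_accelerated(ethtool_stats):
    """Heuristic: at least one vf_* counter exists and is non-zero."""
    return any(value > 0 for value in vf_counters(ethtool_stats).values())


# Illustrative sample only; real counter names and values depend on the driver.
sample = """NIC statistics:
     tx_scattered: 0
     vf_rx_packets: 992956
     vf_rx_bytes: 2749784180
     vf_tx_packets: 2656684
     vf_tx_bytes: 1099443970
     vf_tx_dropped: 0
"""

print(looks_accelerated(sample))
```

Running the check twice a few seconds apart and comparing the counters is a stronger signal than a single reading, since counters that keep increasing show the virtual function is still in use.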
In addition to Accelerated Networking, SAS needs to be on an isolated cloud VNET, Resource Group, etc. It should “share nothing” with other customer infrastructures. The one exception is that the instances for your shared file system and any RDBMSs dedicated to SAS should be placed on this VNET as well.
Deploy on a single VNET and subnet created specifically for this deployment. Do not use any inspection or tracing features on the VNET.
To get optimal network connectivity, all components need to be in the same Azure Proximity Placement Group within the same Availability Zone within the same Region. https://docs.microsoft.com/en-us/azure/virtual-machines/windows/proximity-placement-groups-portal
And finally, let’s discuss configuring the external Premium Storage. As with the instance types, there is a maximum IO throughput per Premium Disk. These values can be found on the “Throughput per disk” row of this table. Multiple Premium Disks, enough to meet or exceed the “Max uncached disk throughput IOPS/MBps” of an instance, should be attached to the instance. These disks should be striped together to create a single file system that can then simultaneously use the full throughput across all the disks.
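To estimate how many Premium Disks to stripe, divide the instance cap by the per-disk throughput and round up. The Python sketch below assumes a P30 disk delivers 200 MB/sec, which is the figure Azure has published for that size; confirm it against the current “Throughput per disk” row before relying on it. The function names are illustrative.

```python
import math


def disks_to_meet_cap(instance_mb_per_sec, per_disk_mb_per_sec):
    """Smallest number of disks whose combined throughput meets or exceeds the instance cap."""
    if per_disk_mb_per_sec <= 0:
        raise ValueError("per_disk_mb_per_sec must be positive")
    return math.ceil(instance_mb_per_sec / per_disk_mb_per_sec)


def effective_stripe_mb_per_sec(disk_count, per_disk_mb_per_sec, instance_mb_per_sec):
    """A stripe can never deliver more than the instance-level cap allows."""
    return min(disk_count * per_disk_mb_per_sec, instance_mb_per_sec)


E32S_V3_CAP_MBPS = 768   # Standard_E32s_v3 max uncached disk throughput, from the text
P30_MBPS = 200           # assumed P30 per-disk throughput; verify against the Azure table

count = disks_to_meet_cap(E32S_V3_CAP_MBPS, P30_MBPS)
print(count)                                                            # 4
print(effective_stripe_mb_per_sec(count, P30_MBPS, E32S_V3_CAP_MBPS))   # 768
```

Adding disks beyond this count increases capacity but not throughput, because the instance cap becomes the limiting factor.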
When setting up your storage, you will be prompted for setting a Storage Caching value. Please set ReadWrite for your operating system storage and ReadOnly for your SAS data disks.
To summarize the above, the following are good example configurations for a SAS compute node (for either SAS 9.4 or SAS Viya).
Standard_E32s_v3 with four P30 Premium Disks for a total of 4 TB of persistent disk space. If more disk space is needed, then look at larger Premium Disks. Remember the maximum IO bandwidth to the E32 instance is 768 MB/sec. The internal 512 GB drive can be used for temporary file systems, but it cannot be made any larger.
Standard_L32s_v2 with three P30 Premium Disks for a total of 3 TB of persistent disk space. If more disk space is needed, then look at larger Premium Disks. Remember the maximum IO bandwidth to the L32 instance is 640 MB/sec. This instance has four 1.92 TB NVMe drives that can be used for temporary file systems.
In conclusion, there are many resource and configuration settings to check within MS Azure in order to configure an instance to meet the needs of your SAS application. You may have to use an instance type with more cores than needed (with or without a constrained core count) in order to get the IO throughput your application requires.