Last Updated: 12APR2022
Information added on 12APR2022: Information about new Ebds_v5 instances.
This post discusses specifics for running SAS® (either SAS 9.4 or Viya 3.x) in the Microsoft (MS) Azure public cloud. Please review the SAS® Global Forum 2020 paper “Important Performance Considerations When Moving SAS® to a Public Cloud” for critical information that we will not cover in this post.
To get the most from the guidelines in this post, you need to understand the compute resource needs (cores, memory, IO throughput, and network bandwidth) of your SAS applications. If you know this information, then you can override the generic IO throughput recommendations discussed in this post.
Please remember that most public cloud instances list CPUs as virtual CPU(s). These CPUs might be hyperthreaded (two threads per physical core). You need to understand if the vCPU includes hyperthreads so that you can ensure you have the correct number of physical cores for SAS. To convert Intel vCPUs to physical cores, divide the number of hyperthreaded vCPUs by 2.
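As a minimal sketch of the conversion described above, the divide-by-2 rule can be written as a small Python helper. The function name and the threads_per_core parameter are illustrative assumptions, not part of any SAS or Azure tool.

```python
def physical_cores(vcpus: int, threads_per_core: int = 2) -> int:
    """Return the physical core count behind a given number of vCPUs.

    threads_per_core=2 models a hyperthreaded instance (the divide-by-2 rule);
    threads_per_core=1 models an instance without hyperthreading.
    """
    if vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("vcpus and threads_per_core must be positive")
    if vcpus % threads_per_core:
        raise ValueError("vcpus is not a multiple of threads_per_core")
    return vcpus // threads_per_core


# A hyperthreaded instance listed with 32 vCPUs gives SAS 16 physical cores.
print(physical_cores(32))                      # 16
print(physical_cores(32, threads_per_core=1))  # 32
```

Size the instance from the number this returns, not from the vCPU count in the instance listing.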
In addition to the information about Azure instance types, storage, and networking, please follow the best practices in the “Optimizing SAS on RHEL (April 2019, V 1.3.1 or later)” tuning guide. The information in the “2.4.4.4 Virtual Memory Dirty Page Tuning for SAS 9” section on page 17 is essential.
Azure instance types. This link brings you to the list of instance types. Read the description carefully to thoroughly understand what compute resources are available with each instance.
If the instance type contains multiple processor models (such as the Esv3 series, which can use Broadwell, Skylake, or Cascade Lake processors), you need to confirm that each instance is using the same Intel processor, since you cannot select the chipset for the VM from the Portal. After an instance is instantiated, running the lscpu command lists the CPU model name for the system.
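One way to compare processors across several instances is to collect the lscpu output from each one and group the hosts by the reported model name. The sketch below assumes that output is already available as text. The host names and the sample CPU strings are illustrative placeholders, and the helper names are hypothetical.

```python
import subprocess
from typing import Dict, List, Optional


def cpu_model_name(lscpu_output: str) -> Optional[str]:
    """Extract the value of the 'Model name' line from lscpu output."""
    for line in lscpu_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "model name":
            return value.strip()
    return None


def local_cpu_model() -> Optional[str]:
    """Run lscpu on the current Linux host and return its CPU model name."""
    result = subprocess.run(["lscpu"], capture_output=True, text=True, check=True)
    return cpu_model_name(result.stdout)


def hosts_by_model(outputs_by_host: Dict[str, str]) -> Dict[Optional[str], List[str]]:
    """Group host names by the CPU model found in each host's lscpu output."""
    groups: Dict[Optional[str], List[str]] = {}
    for host, text in outputs_by_host.items():
        groups.setdefault(cpu_model_name(text), []).append(host)
    return groups


# Illustrative lscpu excerpts for two hypothetical nodes.
SAMPLE = {
    "node1": "Architecture: x86_64\nModel name:   Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz\n",
    "node2": "Architecture: x86_64\nModel name:   Intel(R) Xeon(R) Platinum 8171M CPU @ 2.10GHz\n",
}

groups = hosts_by_model(SAMPLE)
if len(groups) > 1:
    print("Mixed CPU models found:", groups)
```

If more than one group comes back, the instances are not all on the same processor.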
For SAS Grid compute nodes and CAS Controller/Workers, we recommend that these systems all use the same CPU generation. This gives you consistent performance overall, rather than performance dictated by the slowest and oldest CPU in the group. Please work with your Microsoft account team to determine how to make this happen. Also, we strongly suggest investing in Unified (a.k.a. Premier) Support when deploying SAS in Azure.
General Tuning Guidance
There are two workarounds for the sporadic NMI lockups that can occur on SAS compute nodes running RHEL 7.x (3.10 kernel). Add either of the following options to Grub and reboot the machine.
Network
To validate that Accelerated Networking is enabled on a Linux instance, please run the following commands and ensure your output looks like the output on this web site.
External Storage
To achieve the most IO throughput for SAS, please make sure that you follow the best practices in the “Optimizing SAS on RHEL (April 2019, V 1.3.1 or later)” tuning guide. The information in the “2.4.4.4 Virtual Memory Dirty Page Tuning for SAS 9” section on page 17 is essential.
The following architecture recommendations cover scale-up scenarios. Scale-out recommendations will follow later, pending validation.
When creating disk storage, you will be prompted for setting a Storage Caching value. Please set the following based on the type of files that will be used by these disks:
* this value was changed on 07DEC2021 after additional testing.
With the RHEL 7.x distribution and 3.x kernel, testing has shown that leaving the virtual-guest tuned profile in place (vm.dirty_ratio = 30 and vm.dirty_background_ratio = 10) achieves the best IO throughput when using Premium Storage.
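To check whether an instance still has these two values, one option is to read them from /proc/sys/vm and compare them with the numbers quoted above. This is a minimal sketch that assumes a Linux host. The function names are hypothetical, and the expected values are the ones stated in this post.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple

# Values quoted in this post for the virtual-guest tuned profile.
EXPECTED = {"dirty_ratio": 30, "dirty_background_ratio": 10}


def read_vm_settings(root: str = "/proc/sys/vm") -> Dict[str, int]:
    """Read the current vm.* values named in EXPECTED from the given directory."""
    return {name: int((Path(root) / name).read_text().strip()) for name in EXPECTED}


def mismatches(current: Dict[str, int]) -> Dict[str, Tuple[Optional[int], int]]:
    """Return {setting: (current, expected)} for every value that differs."""
    return {
        name: (current.get(name), want)
        for name, want in EXPECTED.items()
        if current.get(name) != want
    }


def main() -> None:
    if not Path("/proc/sys/vm").is_dir():
        print("No /proc/sys/vm here; run this on the Linux instance.")
        return
    diffs = mismatches(read_vm_settings())
    if not diffs:
        print("vm.dirty_ratio and vm.dirty_background_ratio match the quoted values.")
    for name, (have, want) in diffs.items():
        print(f"vm.{name} is {have}, expected {want}")


if __name__ == "__main__":
    main()
```

The script only reports differences; it does not change any kernel setting.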
As a reminder, SAS temporary files and directories such as SAS WORK, SAS UTILLOC and CAS_DISK_CACHE should be placed on storage with the highest proven throughput possible. Today that usually means Premium Storage or the instance’s local SSD.
Reference Instances for SAS Compute Nodes
To summarize the above, the following are good example configurations for SAS 9.4 or SAS Viya 3.5 compute nodes.
Standard_E16bds_v5 or E32bds_v5 - specs for this system: recommended instances, but they may not be available everywhere since they are newly released.
Standard_E64-32ds_v4 or E64-16ds_v4 - specs for this system: recommended instances
Standard_E32s_v4 - specs for this system:
Standard_L32s_v2 - specs for this system:
Conclusion
There are many resources, configuration settings, and constraints to check within Azure when configuring an instance to meet the needs of your SAS application. It is likely that you will have to provision an instance with more physical cores than you need (with or without a constrained core count) in order to get the IO throughput your application requires. Likewise, you may have to provision more storage capacity than you need in order to achieve that IO throughput.
As always, there are cost versus performance choices. These selections need to be based on your SLAs and business needs for SAS applications running in Azure versus where they are currently running.
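The trade-off in this conclusion can be sketched as a small selection routine: among the candidate instances, pick the cheapest one that meets both the core requirement and the IO throughput requirement. Everything in the catalog below (names, core counts, throughput figures, and costs) is a hypothetical placeholder and not Azure data. Substitute the published specs and prices for the instances you are considering.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceOption:
    name: str
    physical_cores: int
    max_io_mbps: int      # maximum IO throughput, MB/s
    hourly_cost: float


def cheapest_fit(
    options: Sequence[InstanceOption], cores_needed: int, io_mbps_needed: int
) -> Optional[InstanceOption]:
    """Return the lowest-cost option meeting both requirements, or None."""
    fits = [
        o for o in options
        if o.physical_cores >= cores_needed and o.max_io_mbps >= io_mbps_needed
    ]
    return min(fits, key=lambda o: o.hourly_cost) if fits else None


# Hypothetical catalog: placeholder values for illustration only.
CATALOG = [
    InstanceOption("size-a", physical_cores=8, max_io_mbps=400, hourly_cost=1.0),
    InstanceOption("size-b", physical_cores=16, max_io_mbps=800, hourly_cost=2.0),
    InstanceOption("size-c", physical_cores=32, max_io_mbps=1600, hourly_cost=4.0),
]

# The workload needs only 8 cores, but its IO requirement rules out size-a and size-b.
choice = cheapest_fit(CATALOG, cores_needed=8, io_mbps_needed=1200)
if choice is not None:
    print(choice.name, "with", choice.physical_cores - 8, "cores beyond the requirement")
```

With these placeholder numbers, the throughput requirement forces the choice of an instance with more cores than the workload needs.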
Acknowledgements
Many thanks to SAS R&D, SAS Technical Support, Microsoft Azure, Azure NetApp Files, Sycomp, Veritas, and DDN experts for reviewing this post.
UPDATE to the Azure NetApp Files (ANF) section:
Please note as the blog states:
The(se following) architecture recommendations cover scale-up scenarios. Scale-out recommendations will follow later, pending validation.
I have updated this post with our recommendation to use the E64-32ds_v4 or E64-16ds_v4 instances for SAS to get optimal IO throughput with SAS 9.4, especially SAS Grid.
I also updated the External storage section.
Will keep updating the post as we learn more about MS Azure and SAS.
Added information today about the sporadic NMI lockups that might hold processing while a thread waits for an available vCPU when using RHEL 7.x (3.10 kernel) with SAS compute nodes. Please review that new section to see how to overcome the issue.
Please note that the issue does not occur in RHEL 8.
Speaking of RHEL 8, please remember that SAS Viya 3.x is not supported on RHEL 8 at the current time.
Thank you for your post.
I have a rather basic question, so I'm not offended in the least if you redirect me elsewhere. Here's my situation and question:
Our company is currently running SAS 9.4 on a Windows 2016 (Standard) server. Is there a way for us to access data in Azure (format to be determined but more than likely HDINSIGHT) from our current environment or must we use a cloud-based SAS instance of SAS 9.4? I'm assuming here that a cloud-based SAS 9.4 instance in Azure is supported and that SAS Viya would not be required in order to have a cloud-based instance. If I'm wrong on any point, please correct me.
Thank you,
Jim
This post Access Microsoft Azure Storage & Big Data - SAS Support Communities seems to indicate that SAS/ACCESS to Hadoop can do what you are asking for. You could post a question as a reply to the above post to clarify.
Thank you.
We updated this section of the paper today:
When creating disk storage, you will be prompted for setting a Storage Caching value. Please set the following based on the type of files that will be used by these disks:
* this value was changed on 07DEC2021 after additional testing.
Are these values still relevant now? I am finding my Read I/O in Azure with my Striped Disks is very slow compared to my Write I/O on my Work and Data drives. Enabling the Read caching helps, but I am apprehensive about leaving it that way if the recommendation is not to use it.
Can you share with me what Azure instance you are using and what you mean by "Striped Disks"? What disks are being used for DATA and what are being used for WORK?
I just posted it to the Admin and Deployment thread.
Windows Azure server slow read I/O - SAS Support Communities
I have added information to the paper regarding using Veritas InfoScale with SAS. Details can be found here: InfoScale by Veritas: A shared file system to use ... - SAS Support Communities
Hello,
Has anyone deployed SAS Office Analytics 9.4 using Azure Virtual Desktop? Looking for specific feedback or guidance on this Azure deployment option. Thank you.