AUTHOR Margaret Crevar, SAS
The best practices that will be presented are lessons learned from SAS customers who have moved to Microsoft Azure. All of these practices are being placed in a single location to share with other SAS customers looking to move their SAS applications to Microsoft Azure.
Watch Best Practices When Moving SAS® to Microsoft Azure as presented by the author on the SAS Users YouTube channel.
DISCLAIMER: Public clouds enhance their offerings on a regular basis. The information shared in this presentation reflects best practices at the time of writing and may no longer apply in six months or more.
BEFORE YOU START: You need to understand the compute resource needs (cores, memory, IO throughput and network bandwidth) of your SAS applications. The more details you know about your SAS application, the better you can be guided to the Microsoft Azure infrastructure it needs.
After you understand the SAS applications’ compute resource needs, you need to define the success criteria for moving to the cloud. Many say they want the SAS application to run faster than it does on-premises. Please note that accomplishing this may be outside the budgetary constraints of your project. So, if the budget is limited, you may have to adjust user expectations.
Before planning your move to the cloud, you also need to understand where all the pieces of the SAS infrastructure will reside: data sources, client interfaces and third-party tools. If any of these are located outside the Azure Proximity Placement Group used for the SAS infrastructure, performance may degrade.
To help you determine the instance type to use for your SAS application, let’s discuss the different SAS components.
This link brings you to the list of Azure instance types. Read the description carefully to understand what compute resources are available with each instance family.
For the Esv3 series, an instance may be provisioned on one of several processor models: Broadwell, Skylake or Cascade Lake. You cannot select the chipset for an instance through the Azure portal. After an instance is created, the lscpu command lists the CPU model name for the system.
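As a minimal sketch of how the lscpu check could be automated across nodes, the Python below extracts the "Model name" field from captured lscpu output and flags nodes whose processor differs from the majority. The function names, node names and sample model strings are illustrative assumptions, not part of any SAS or Azure tooling; how the lscpu text is collected from each node is left out.

```python
from collections import Counter
from typing import Dict


def cpu_model_name(lscpu_output: str) -> str:
    """Return the value of the 'Model name' field from lscpu text output."""
    for line in lscpu_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "model name":
            return value.strip()
    raise ValueError("no 'Model name' field found in lscpu output")


def mismatched_nodes(models_by_node: Dict[str, str]) -> Dict[str, str]:
    """Return the nodes whose CPU model differs from the most common one."""
    if not models_by_node:
        return {}
    most_common_model = Counter(models_by_node.values()).most_common(1)[0][0]
    return {
        node: model
        for node, model in models_by_node.items()
        if model != most_common_model
    }


if __name__ == "__main__":
    # Hypothetical lscpu excerpts captured from three compute nodes.
    captured = {
        "node1": "Architecture: x86_64\nModel name: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz\n",
        "node2": "Architecture: x86_64\nModel name: Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz\n",
        "node3": "Architecture: x86_64\nModel name: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz\n",
    }
    models = {node: cpu_model_name(text) for node, text in captured.items()}
    print(mismatched_nodes(models))
```

A non-empty result suggests that at least one node landed on a different processor generation and may need to be redeployed.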
For SAS Grid compute nodes and CAS Controller/Workers, we recommend these systems all be the same processor type. This will ensure you get consistent performance overall. We strongly suggest investing in Azure Unified (a.k.a. Premier) Support.
Please remember that most public cloud instances list CPUs as virtual CPU(s). You need to divide the number of hyperthreaded vCPUs by 2 to get the number of physical cores.
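The conversion above is simple arithmetic; a small Python sketch makes it explicit. The function name is hypothetical, and the default of two threads per core assumes a hyperthreaded instance, as described above.

```python
def physical_cores(vcpus: int, threads_per_core: int = 2) -> int:
    """Convert a hyperthreaded vCPU count into a physical core count."""
    if vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("vcpus and threads_per_core must be positive")
    if vcpus % threads_per_core != 0:
        raise ValueError("vcpus must be a multiple of threads_per_core")
    return vcpus // threads_per_core


if __name__ == "__main__":
    for vcpus in (16, 32, 64):
        print(f"{vcpus} vCPUs -> {physical_cores(vcpus)} physical cores")
```

For example, an instance listed with 64 vCPUs provides 32 physical cores.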
There are multiple file systems needed by SAS. Gather a list of the ones currently used by your SAS application before reviewing what is needed in Azure.
Permanent SAS Data Storage: This data storage persists between SAS sessions and after reboots of the Azure instances. It can be local file systems or shared file systems (more details in the next section). The link in this section will point you to the different Premium storage drives to use for locally attached file systems. As with instance types, each Premium Disk has a maximum IO throughput.
Temporary SAS Data Storage: This data storage does not persist between SAS sessions or after reboots of the Azure instance. Ephemeral storage can be used for it. This storage tends to need the fastest IO throughput.
What IO throughput is needed for each: You need to gather this from your existing SAS infrastructure and determine if storage in Azure can match what you have on-premises or if you need faster IO throughput.
To get consistent instance-to-instance IO and throughput, ensure all your instances are in the same Azure Proximity Placement Group.
As discussed in the Azure instance section, each instance has a maximum IO throughput to attached storage. If you need more IO throughput per core, you can use constrained cores with Azure instances to reduce the number of physical cores presented to the instance’s operating system. The instance-level IO limit stays the same, so the IO bandwidth available per physical core increases.
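To illustrate the constrained-core effect, here is a hedged Python sketch that divides an instance-level IO throughput limit by the number of active physical cores. The 1200 MB/s limit is a made-up placeholder, not a published Azure figure; substitute the limit documented for your instance type.

```python
def throughput_per_core(
    instance_limit_mbps: float,
    active_vcpus: int,
    threads_per_core: int = 2,
) -> float:
    """IO throughput available per physical core, in MB/s."""
    if active_vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("active_vcpus and threads_per_core must be positive")
    cores = active_vcpus / threads_per_core
    return instance_limit_mbps / cores


if __name__ == "__main__":
    limit = 1200.0  # hypothetical instance-level limit in MB/s
    full = throughput_per_core(limit, active_vcpus=64)
    constrained = throughput_per_core(limit, active_vcpus=32)
    print(f"64 active vCPUs: {full:.1f} MB/s per core")         # 37.5
    print(f"32 active vCPUs: {constrained:.1f} MB/s per core")  # 75.0
```

Halving the active vCPU count doubles the throughput available to each remaining core, because the instance-level limit does not change.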
As a reminder, SAS temporary files and directories such as SAS WORK, SAS UTILLOC and CAS_DISK_CACHE should be placed on storage with the highest proven throughput possible. Today that usually means Premium Storage or the instance’s local ephemeral SSD.
A shared file system is required if you are using SAS Grid. As with everything, each shared file system option has pros and cons.
These guidelines are “lessons learned” from the initial SAS customers in the Azure cloud.
More details on the above can be found in this blog post: Best Practices for Using Microsoft Azure with SAS® - SAS Support Communities
To summarize the above, the following is a good example configuration for SAS 9.4 or SAS Viya 3.5 compute nodes. Standard_E64-32ds_v4: specs for this system:
There are many resources, configuration settings and constraints to check within Azure to configure an instance to meet the needs of your SAS application.
It is possible you may have to provision an instance with more physical cores (with or without a constrained core count) in order to get the IO throughput required by your SAS application. Likewise, you may also have to over-provision storage capacity to achieve the IO throughput needs for your SAS application.
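The storage over-provisioning trade-off can be sketched as a small calculation: how many identical disks are needed to meet a throughput target, and how much unused capacity that leaves. The per-disk throughput and size in the example are illustrative placeholders; check current Azure documentation for real limits, and remember that the instance's own IO limit still applies to the combined disks.

```python
import math
from typing import Tuple


def disks_for_throughput(
    required_mbps: float,
    needed_gib: float,
    per_disk_mbps: float,
    per_disk_gib: float,
) -> Tuple[int, float]:
    """Return (disk count, surplus GiB) needed to satisfy both targets."""
    if min(required_mbps, needed_gib, per_disk_mbps, per_disk_gib) <= 0:
        raise ValueError("all arguments must be positive")
    by_throughput = math.ceil(required_mbps / per_disk_mbps)
    by_capacity = math.ceil(needed_gib / per_disk_gib)
    count = max(by_throughput, by_capacity)
    surplus_gib = count * per_disk_gib - needed_gib
    return count, surplus_gib


if __name__ == "__main__":
    # Hypothetical targets and disk characteristics.
    count, surplus = disks_for_throughput(
        required_mbps=1600, needed_gib=2048,
        per_disk_mbps=200, per_disk_gib=1024,
    )
    print(f"{count} disks, {surplus:.0f} GiB of capacity beyond what is needed")
```

With these placeholder numbers, throughput rather than capacity drives the disk count, which is the over-provisioning situation described above.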
As always, there are cost versus performance choices. These selections need to be based on your SLAs and business needs for SAS applications running in Azure. Please remember things are always changing in the public cloud.
Lastly, please join the SAS Communities for “Administration” so you can be kept apprised of new information related to running SAS in the public cloud.
Crevar, Margaret. 2021. Best Practices for Using Microsoft Azure with SAS® - SAS Support Communities
Microsoft Docs. 2021. VM sizes - Azure Virtual Machines
Microsoft Docs. 2021. Azure Premium Storage: Design for high performance - Azure Virtual Machines
Crevar, Margaret. 2021. EXAScaler Cloud by DDN: A shared file system to us... - SAS Support Communities
Crevar, Margaret. 2021. Sycomp Storage Fueled by IBM Spectrum Scale: A new... - SAS Support Communities
Crevar, Margaret. 2021. Azure NetApp Files: A shared file system to use wi... - SAS Support Communities
Many thanks to SAS R&D, SAS Technical Support, Microsoft Azure, Azure NetApp Files, Sycomp and DDN experts for reviewing this post.
Marson, Barry. 2019. Optimizing SAS on RHEL 6 and 7 (redhat.com)
Crevar, Margaret. 2020. Important Performance Considerations when Moving SAS to the Public Cloud
Your comments and questions are valued and encouraged. Contact the author at:
Margaret Crevar
SAS
Margaret.Crevar@sas.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.