
Best Practices When Moving SAS® to Microsoft Azure

Paper 1201-2021

AUTHOR Margaret Crevar, SAS

Abstract

The best practices presented here are lessons learned from SAS customers who have moved to Microsoft Azure. They are gathered in a single location to share with other SAS customers looking to move their SAS applications to Microsoft Azure.

 

WATCH THE PRESENTATION

Watch Best Practices When Moving SAS® to Microsoft Azure as presented by the author on the SAS Users YouTube channel.

 

 

Introduction

DISCLAIMER:  Public clouds enhance their offerings on a regular basis.  The information shared in this presentation reflects the best practices at the time of writing and may no longer be applicable in six or more months.

 

BEFORE YOU START:  You need to understand the compute resource needs (cores, memory, IO throughput, and network bandwidth) of your SAS applications.  The more detail you know about your SAS application, the better you can be guided to the Microsoft Azure infrastructure you need.

 

After you understand the SAS applications’ compute resource needs, define the success criteria for moving to the cloud.  Many say they want the SAS application to run faster than on-premises.  Please note that accomplishing this may be outside the budgetary constraints of your project.  So, if the budget is limited, understand that you may have to adjust user expectations.

 

Before planning your move to the cloud, you also need to understand where all the pieces of the SAS infrastructure will reside: data sources, client interfaces, and third-party tools.  If any of these are located outside the Azure Proximity Placement Group used for the SAS infrastructure, performance degradation may occur.

 

WHAT INSTANCE TYPES TO USE

To help you determine the instance type to use for your SAS application, let’s discuss the different SAS components. 

  • For SAS 9.4, these include the mid-tier, metadata tier, compute tier, and, for SAS Grid, shared file system nodes. 
  • For SAS Viya 3.5, these include the CAS Controller/Workers, microservices, PostgreSQL/RabbitMQ, and SAS Programming Run Time nodes.

The Microsoft Docs page VM sizes - Azure Virtual Machines (see References) lists the Azure instance types.  Read the descriptions carefully to understand what compute resources are available with each instance family. 

 

For the Esv3 series, the instance family spans multiple processor models (Broadwell, Skylake, or Cascade Lake).  Via the Azure portal, you cannot select which chip set an instance will receive.  After an instance is instantiated, running the lscpu command lists the CPU model name for the system. 
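For example, here is a minimal sketch (assuming a Linux VM) that wraps lscpu to report which processor model an instance actually received:

```python
# Report the CPU model of the current instance by parsing lscpu output.
import subprocess

def cpu_model_name() -> str:
    """Return the 'Model name' field from lscpu, or 'unknown'."""
    out = subprocess.run(["lscpu"], capture_output=True, text=True, check=True)
    for line in out.stdout.splitlines():
        if line.startswith("Model name"):
            return line.split(":", 1)[1].strip()
    return "unknown"

if __name__ == "__main__":
    # e.g., "Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz" (a Cascade Lake part)
    print(cpu_model_name())
```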

 

For SAS Grid compute nodes and CAS Controller/Workers, we recommend these systems all be the same processor type. This will ensure you get consistent performance overall.  We strongly suggest investing in Azure Unified (a.k.a. Premier) Support.

 

Please remember that most public cloud instances list CPUs as virtual CPUs (vCPUs). Divide the number of hyperthreaded vCPUs by 2 to get the number of physical cores; for example, a 64-vCPU instance has 32 physical cores.

[Image: table of Edsv4-series instance sizes from the Azure documentation; the last row shows Standard_E64ds_v4.]
  • Review “Temp storage (SSD) GB” and “Max cached and temp storage throughput: IOPS/MBps (cache size in GB)” to see the size and maximum IO throughput of the local, ephemeral disk.  For a Standard_E64ds_v4 instance, the maximum size of the internal SSD that can be used for temporary SAS file systems (SAS WORK/UTILLOC or CAS_DISK_CACHE) is 2400 GB, and the maximum IO throughput is 3872 MB/sec (121 MB/sec per physical core).  If this ephemeral storage is large enough for your SAS WORK/UTILLOC, you may not need to add Premium Storage to the instance.
  • Review “Max uncached disk throughput: IOPS/MBps”.  For the Standard_E64ds_v4 instance (the last row in the table above, and one of the most popular Azure instances used for SAS compute systems), the maximum IO throughput is 1200 MB/sec for the instance as a whole, not per physical core.  For a 32-physical-core system, this means 37.5 MB/sec per physical core of IO bandwidth for all data stored on external Premium Storage (used for SAS WORK/UTILLOC).  If you need more IO throughput per physical core to external Premium Storage, you can constrain the number of cores in the instance; there are more details on “constraining cores” later in this paper.  The per-core arithmetic is worked through in the sketch after this list.
  • Review “Max NICs/Expected network bandwidth (Mbps)” to see the maximum network egress bandwidth.  For the Standard_E64ds_v4 instance, the maximum network egress bandwidth is constrained to 30 Gigabit/sec. 
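Here is the per-core arithmetic from the bullets above as a minimal worked sketch, using the published Standard_E64ds_v4 figures:

```python
# Per-core IO arithmetic for a Standard_E64ds_v4 (64 vCPUs, hyperthreaded).
VCPUS = 64
PHYSICAL_CORES = VCPUS // 2            # 2 vCPUs per physical core -> 32 cores

TEMP_SSD_THROUGHPUT_MBPS = 3872        # local ephemeral SSD (SAS WORK/UTILLOC)
UNCACHED_DISK_THROUGHPUT_MBPS = 1200   # external Premium Storage, instance total

print(TEMP_SSD_THROUGHPUT_MBPS / PHYSICAL_CORES)       # 121.0 MB/sec per core
print(UNCACHED_DISK_THROUGHPUT_MBPS / PHYSICAL_CORES)  # 37.5 MB/sec per core
```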

 

WHAT STORAGE TYPE TO USE

There are multiple file systems needed by SAS.  Gather a list of the ones currently used by your SAS application before reviewing what is needed in Azure.

 

Permanent SAS Data Storage:  This data storage persists between SAS sessions and after reboots of the Azure instances.  It can be local file systems or shared file systems (more details in the next section).  The Microsoft Docs page Azure Premium Storage: Design for high performance (see References) describes the different Premium storage disks to use for locally attached file systems.  Like the instance types, each Premium Disk has a maximum IO throughput. 

 

Temporary SAS Data Storage: This data storage does not persist between SAS sessions or after reboots of the Azure instance.  Ephemeral storage can be used for this storage.  This storage tends to need the fastest IO throughput.

 

What IO throughput is needed for each: You need to gather this from your existing SAS infrastructure and determine if storage in Azure can match what you have on-premises or if you need faster IO throughput.

To get consistent instance-to-instance IO and throughput, ensure all your instances are in the same Azure Proximity Placement Group.

 

As discussed in the instance type section, there is a maximum IO throughput to attached storage.  If you need more IO throughput per core, you can use constrained-core Azure instances, which reduce the number of physical cores presented to the instance’s operating system while keeping the larger instance’s IO limits, thereby increasing the IO bandwidth per physical core, as the sketch below illustrates.
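The following minimal sketch compares the full Standard_E64ds_v4 with its constrained-core sibling Standard_E64-32ds_v4. Both share the same 1200 MB/sec uncached disk limit, so halving the core count doubles the bandwidth per physical core:

```python
# Constrained cores: same instance-level IO limit, fewer cores to share it.
UNCACHED_LIMIT_MBPS = 1200  # identical for E64ds_v4 and E64-32ds_v4

for name, physical_cores in [("Standard_E64ds_v4", 32),
                             ("Standard_E64-32ds_v4", 16)]:
    per_core = UNCACHED_LIMIT_MBPS / physical_cores
    print(f"{name}: {per_core:.1f} MB/sec per physical core")

# Standard_E64ds_v4: 37.5 MB/sec per physical core
# Standard_E64-32ds_v4: 75.0 MB/sec per physical core
```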

 

As a reminder, SAS temporary files and directories (SAS WORK, SAS UTILLOC, and CAS_DISK_CACHE) should be placed on the storage with the highest proven throughput available.  Today that usually means Premium Storage or the instance’s local ephemeral SSD.

 

WHICH SHARED FILE SYSTEM TO USE

A shared file system is required if you are using SAS Grid.  As with everything, each shared file system has pros and cons; see the References for posts on options tested with SAS in Azure (EXAScaler Cloud by DDN, Sycomp Storage Fueled by IBM Spectrum Scale, and Azure NetApp Files).

 

TUNING GUIDELINES

These guidelines come from lessons learned by the first SAS deployments in the Azure cloud.

  1. To get consistent instance-to-instance IO and network throughput, ensure all your instances are in the same Azure Proximity Placement Group.
  2. When using RHEL 7.x (3.10 kernel) on SAS compute nodes, watch for sporadic NMI lockups that can stall processing while a thread waits for an available vCPU. There is a known issue in the iSCSI and SCSI drivers in this kernel that can cause CPU lockups under heavy IO load.
  3. To achieve optimal network bandwidth, Azure Accelerated Networking must be enabled.  Accelerated Networking is available on any Azure VM with 2 or more physical cores. 
  4. In addition to Accelerated Networking, SAS needs to be on an isolated cloud VNET, Resource Group, etc. This VNET should “share nothing” with other customer infrastructure.  The exception is placing the instances for your shared file system and any RDBMS dedicated to SAS on this VNET as well. 
  5. Azure default VM network MTU size - Azure strongly recommends that the default network MTU size of 1500 not be adjusted, because Azure’s Virtual Network Stack will attempt to fragment a packet at 1400 bytes. To learn more, please review the “Azure and MTU” article. A quick way to verify the MTU is sketched after this list.
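For guideline 5, here is a minimal sketch that verifies the MTU is still Azure’s default of 1500. It assumes a Linux VM and an interface named eth0, which is a hypothetical name; substitute the interface used in your environment:

```python
# Check that the network interface MTU has not been changed from Azure's
# default of 1500 (see the "Azure and MTU" article).
from pathlib import Path

IFACE = "eth0"  # hypothetical interface name; adjust for your VM

mtu = int(Path(f"/sys/class/net/{IFACE}/mtu").read_text())
if mtu != 1500:
    print(f"WARNING: {IFACE} MTU is {mtu}; Azure recommends leaving it at 1500")
else:
    print(f"{IFACE} MTU is the Azure default (1500)")
```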

More details on the above can be found in this blog post: Best Practices for Using Microsoft Azure with SAS® - SAS Support Communities

 

SAMPLE AZURE INSTANCE

To summarize the above, the following is a good example configuration for SAS 9.4 or SAS Viya 3.5 compute nodes: the Standard_E64-32ds_v4, with these specs:

  • Cascade Lake processor
  • 16 physical cores (32 vCPUs)
  • 504 GB RAM
  • The internal 2,400 GB SSD can be used for SAS temporary file systems, but it cannot be increased in size. Its throughput is 3,872 MB/sec: spread across the constrained 16 physical cores, that is 242 MB/sec per physical core (121 MB/sec per core on the unconstrained 32-core E64ds_v4).  SAS recommends at least 150 MB/sec per physical core.
  • 30 Gigabit egress network connectivity
  • For persistent storage, use six P30 Premium Disks striped together for a total of 6 TB. If more disk space is needed, add more P30 disks or use larger Premium Disks. Remember that the maximum IO bandwidth to the E64 instance is 1,200 MB/sec; with the constrained cores, this equates to 75 MB/sec per physical core for the Standard_E64-32ds_v4, while SAS recommends at least 100 MB/sec per physical core. The striped-volume math is sketched after this list.
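To make the striped-volume numbers concrete, here is a minimal sketch of the arithmetic. It assumes the P30 figures published at the time of writing (roughly 1 TB of capacity and 200 MB/sec of throughput per disk); verify these against the current Azure Premium Storage documentation:

```python
# Six striped P30 Premium Disks behind a Standard_E64-32ds_v4.
P30_CAPACITY_TB = 1            # assumed per-disk capacity
P30_THROUGHPUT_MBPS = 200      # assumed per-disk throughput
DISKS = 6
VM_UNCACHED_LIMIT_MBPS = 1200  # instance-level cap on uncached disk IO
PHYSICAL_CORES = 16

capacity_tb = DISKS * P30_CAPACITY_TB           # 6 TB striped volume
throughput = min(DISKS * P30_THROUGHPUT_MBPS,   # disks can deliver 1200 MB/sec
                 VM_UNCACHED_LIMIT_MBPS)        # VM also caps at 1200 MB/sec

print(capacity_tb, throughput, throughput / PHYSICAL_CORES)  # 6 1200 75.0
```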

 

CONCLUSION

There are many resources, configuration settings, and constraints to check within Azure when configuring an instance to meet the needs of your SAS application. 

 

It is possible you may have to provision an instance with more physical cores (with or without a constrained core count) in order to get the IO throughput required by your SAS application. Likewise, you may also have to over-provision storage capacity to achieve the IO throughput your SAS application needs.

 

As always, there are cost versus performance choices. These selections need to be based on your SLAs and business needs for SAS applications running in Azure.  Please remember things are always changing in the public cloud.

Lastly, please join the Administration community on SAS Communities so you can stay apprised of new information about running SAS in the public cloud. 

 

References

Crevar, Margaret. 2021. “Best Practices for Using Microsoft Azure with SAS®.” SAS Support Communities.

Microsoft Docs. 2021. “VM sizes - Azure Virtual Machines.”

Microsoft Docs. 2021. “Azure Premium Storage: Design for high performance - Azure Virtual Machines.”

Crevar, Margaret. 2021. “EXAScaler Cloud by DDN: A shared file system to us...” SAS Support Communities.

Crevar, Margaret. 2021. “Sycomp Storage Fueled by IBM Spectrum Scale: A new...” SAS Support Communities.

Crevar, Margaret. 2021. “Azure NetApp Files: A shared file system to use wi...” SAS Support Communities.

 

ACKNOWLEDGEMENTS

Many thanks to SAS R&D, SAS Technical Support, Microsoft Azure, Azure NetApp Files, Sycomp and DDN experts for reviewing this post. 

Recommended Reading

Marson, Barry. 2019. Optimizing SAS on RHEL 6 and 7 (redhat.com)

Crevar, Margaret. 2020. Important Performance Considerations when Moving SAS to the Public Cloud

 

Contact Information

Your comments and questions are valued and encouraged. Contact the author at:

Margaret Crevar

SAS

Margaret.Crevar@sas.com

www.sas.com

 

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are trademarks of their respective companies.

 

Comments

Thank you @MargaretC , as specific and detailed as usual, you deliver.

 

By the way, the question from yesterday, about the sizing for Viya 4, was posted by me.

 

You asked for my email for you to contact me, but chat was closed. In any case, I believe you have my mail address. If not, please let me know and I'll be happy to drop you a message.

Please drop me an email.
Thank you.

Thank you, it is on its way to you

