Important Performance Considerations When Moving
SAS Applications to the Amazon Cloud
Last Updated: 07JUL2018
UPDATE: Margaret presented a SAS Global Forum 2018 paper with updated information on this subject. Please refer to that paper, Important Performance Considerations When Moving SAS to a Public Cloud (https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1866-2018.pdf), rather than to the information in this post.
Executive Summary
Any architecture chosen by a SAS customer to run their SAS applications requires adequate IO throughput, CPU cores, memory, and physical disk space.
This post discusses important performance considerations for SAS 9 (both SAS Foundation and SAS Grid) and SAS Viya in the Amazon Cloud.
Many companies are deciding to move their data centers, including their SAS applications, from on-premises to a public cloud. Because of this, SAS customers are asking whether SAS can run in a public cloud. The short answer is yes: SAS has been tested without errors in many of the public clouds. The more in-depth answer is that customers need to understand their computing resource needs before committing to move a SAS application that performs well on-premises to any public cloud.
Performing a detailed assessment of the computing resources needed to support a customer's SAS applications is required before deciding to go to production in the public cloud. Existing customers should use the information from that assessment to set up instances and storage as a proof of concept in the public cloud before committing to the move. It is important that they work closely with their IT team to examine all compute resources, including IO throughput, number of cores, amount of memory, and amount of physical disk space. It is also important to see how close their existing compute infrastructure comes to meeting their SAS users' current SLAs. Compute resource data can be gathered by monitoring the existing systems in their SAS infrastructure with hardware monitoring tools such as nmon for Linux and AIX and perfmon for Windows. For new customers, this assessment is more difficult, since neither they nor we may know exactly how they will be using SAS and what their workload demands will be.
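For Linux hosts, a minimal sketch of that monitoring step with nmon might look like the following; the sampling interval, sample count, and output directory are assumptions, and a perfmon data collector set plays the same role on Windows:

# Capture system metrics every 60 seconds for 24 hours (1,440 samples)
# into spreadsheet-format files for later review
mkdir -p /var/log/nmon
nmon -f -s 60 -c 1440 -m /var/log/nmon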
Once the above information is gathered, it should be used to determine which AWS EC2 instance type is the best fit for the various SAS tiers. Information on the AWS EC2 instance types that are a good fit for SAS is listed below. One thing that is not shown in the table is that the maximum IO throughput you can get between an AWS EC2 instance and EBS storage is 50 MB/second/core (a quick back-of-the-envelope sketch of this limit follows the table). This is why using AWS EC2 instances without ephemeral storage is not a good idea: your customer would have to put SAS WORK/UTILLOC on this slower EBS storage.
AWS Instance type | I3 | I2 | R4 | R3 | D2
Description | High I/O | High I/O | Memory Optimized | Memory Optimized | Dense-Storage
Intel Processor | E5-2686 v4 (Broadwell) | E5-2670 v2 (Ivy Bridge) | E5-2686 v4 (Broadwell) | E5-2670 v2 (Ivy Bridge) | E5-2676 v3 (Haswell)
Adequate Memory | Yes | Yes | Yes | Yes | Yes
Network | up to 20 Gbit | up to 10 Gbit | up to 20 Gbit | up to 10 Gbit | up to 10 Gbit
Internal Storage | NVMe SSDs | SSDs | No (EBS-only) | SSD (limited) | HDD
Other | Only supported with RHEL 7.3; you need to run "yum update kernel" to get the NVMe fix. | | | |
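As a back-of-the-envelope check of the 50 MB/second/core EBS guideline mentioned above, something like the following can be run on a candidate instance (nproc reports vCPUs, which are hyperthreads; the 50 MB/second figure is the guideline from this post, not a measured value):

# Approximate the EBS throughput ceiling for this instance
VCPUS=$(nproc)
CORES=$(( VCPUS / 2 ))   # vCPUs are hyperthreads; SAS sizing uses physical cores
echo "Physical cores: ${CORES}"
echo "Approx. EBS throughput ceiling: $(( CORES * 50 )) MB/second"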
It must be determined that the instance has enough physical cores (vCPUs/2, because the vCPUs are actually hyperthreads). Also, the AWS-specific SAS SETINIT must be applied so that the SAS deployment can access the number of physical cores that have been licensed. SAS runs well on physical cores, but not on hyperthreads, because the two hyperthreads on a core share its floating-point units. In addition to provisioning cores (dividing the number of vCPUs by two), it must be determined that there is enough memory AND IO throughput for the SAS applications. Here are some considerations for various SAS applications that should be well understood before choosing AWS EC2 instances:
Currently, the best AWS EC2 instance to use with SAS Foundation, SAS Grid, and SAS Viya is the I2 instance family. Please note the cost of I2 instances (http://www.ec2instances.info/), and recognize that some customers may want to go with a cheaper AWS EC2 instance. Before you suggest that they go to a different AWS EC2 instance, please make sure it can deliver IO throughput to the ephemeral storage for SAS WORK or Viya CASCACHE equivalent to that of the AWS EC2 I2 instances.
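To confirm that the vCPUs on a given instance are hyperthreads, and therefore how many physical cores are really available, a quick check such as this is enough (the exact lscpu field names can vary slightly by distribution):

# "Thread(s) per core: 2" means the vCPUs are hyperthreads;
# physical cores = Socket(s) x Core(s) per socket
lscpu | grep -E 'Model name|^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'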
While we are talking about Amazon, there are several common misconceptions regarding how SAS will run in AWS. Let's discuss some of the more popular ones:
Bottom line:
SAS customers may have to stand up more compute resources in the public cloud than their EEC sizing suggests and/or than they have in their on-premises system (more cores, more instances, and more physical disk space) in order to meet their SAS applications' needs, especially from an IO throughput perspective.
References:
https://support.sas.com/resources/papers/Implementing-SAS9-4-Software-Cloud-Infrastructures.pdf
Contacts:
+1 919.531.7095
+1 919.531.9984
Just learned from Red Hat that a driver has been added to RHEL 7.3 so that you can use the AWS EC2 I3 instances now.
For now, spin up an I3 instance using the RHEL 7.3 AMI. Once it is up, you will need to issue the command "yum update kernel".
There is no support for ENA (enhanced networking) until RHEL 7.4, due out later this year.
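A minimal sketch of that update step on a freshly launched I3 instance, assuming sudo access (a reboot is needed to pick up the new kernel):

sudo yum update kernel   # pulls in the kernel containing the NVMe fix
sudo reboot              # boot into the updated kernel before configuring SAS WORK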
We've been running our SAS install on AWS for about a year.
i2.8xlarge. We striped the 8 ephemeral disks as RAID 0 for SAS WORK.
We are currently using Provisioned IOPS (io1) for our user data, single drive not striped.
A couple of questions based on the very helpful post (wish we had it 1 year ago)!!! 🙂
It seems you are recommending we move to a set of 4 (or even 8) st1 drives for user data. Is the performance stronger? Any concerns about lack of IOPS? What kind of stripe configuration do you recommend?
Do you recommend running fstrim on the Ephemeral disks on any interval to mitigate any concerns there?
What is the downside to not using the "AWS-specific SETINIT"? What exactly is the "AWS-specific SETINIT"?
Appreciate your thoughts and consideration.
Brian Conneen
brian@marlettefunding.com
The type of IO SAS does is large sequential IO, not random IO (IOPS). Because of this, you need lots of disk heads to support the IO needs, even for your permanent SAS data files. That is why Amazon has recommended that, if you are using EBS storage, you use the st1 volume type and create multiple EBS st1 volumes that you stripe together.
Not sure what fstrim is. I will check with Amazon to find out their thoughts.
There is no downside to using an "AWS-specific SETINIT" with SAS. Just please make sure SAS Technical Support knows you are running SAS on AWS if you call in with any issues.
Cheers,
Margaret
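For reference, creating a set of st1 volumes to stripe together might look like the sketch below; the sizes, Availability Zone, volume/instance IDs, and device names are placeholders rather than values from this thread:

# Create four 2 TB st1 (throughput-optimized HDD) volumes to stripe for permanent SAS data
for i in 1 2 3 4; do
  aws ec2 create-volume --availability-zone us-east-1a --size 2048 --volume-type st1
done
# Attach each volume to the SAS compute instance (repeat with each returned volume ID)
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf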
RAID 0 for SAS WORK and SAS UTILLOC on ephemeral storage is fine. It is what we recommend.
Please note that with AWS EC2 I3 instances there are issues with connecting to the NVMe SSD drives with RHEL. This has been fixed with RHEL 7.3 only. You will need to use the existing "community" RHEL 7.3 AMI, and once it has been loaded, you will need to issue "yum update kernel" to apply the patches needed for the NVMe SSDs to work.
Please let us know if you have any questions on the above.
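A minimal sketch of that RAID 0 layout using LVM and XFS, assuming four ephemeral NVMe devices (the device names, stripe width, stripe size, and mount options are assumptions that depend on the instance type):

# Stripe the ephemeral NVMe devices for SAS WORK/UTILLOC (scratch data only; instance storage is not persistent)
pvcreate /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
vgcreate vg_saswork /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
lvcreate -n lv_saswork -i 4 -I 64 -l 100%FREE vg_saswork   # 4-way stripe, 64 KB stripe size
mkfs.xfs /dev/vg_saswork/lv_saswork
mkdir -p /saswork
mount -o noatime,nodiratime /dev/vg_saswork/lv_saswork /saswork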
Update on my comparisons between an i2.8xlarge and an i3.16xlarge.
Price: $25.55 a month more for the i3.16xlarge (Linux Reserved, 1-year agreement).
Total Capacity: i2.8xlarge = 5.82 TB, i3.16xlarge = 13.82 TB
Took both box types and put the 8 instance disks into an LVM striped configuration formatted with XFS.
Used bonnie++ to benchmark them.
i2.8xlarge
Sequential Read: 1936.4 MB/sec
Sequential Write: 1153.6 MB/sec
Sequential ReWrite: 816.2 MB/sec
i3.16xlarge
Sequential Read: 2578.9 MB/sec (33% improvement)
Sequential Write: 1166.5 MB/sec (1% improvement)
Sequential ReWrite: 978.1 MB/sec (20% improvement)
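(For reference, and not the exact invocation used above: a typical bonnie++ run against the striped volume might look like the line below. The directory, file size, and user are assumptions; bonnie++ goes through the OS file cache by default, so the file size should be set well above the instance's RAM.)

# Sequential IO benchmark against the striped XFS volume; -f skips the slow per-character tests
bonnie++ -d /saswork -s 1048576 -u ec2-user -f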
I am not familiar with bonnie++. Does it do direct IO or go through the operating system's file cache? And did you set the page size to 64K, which is what SAS 9.4 uses by default?
SAS has written a script that mimics how SAS does IO. I would love for you to run it and let us know what the results on the two instances are. You can find the tool here http://support.sas.com/kb/59/680.html
Margaret
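(Once the script from that SAS note is downloaded to the instance, running it against the file system you want to measure looks like this; the target path is just an example.)

chmod +x rhel_iotest.sh
./rhel_iotest.sh -t /saswork   # reports read and write throughput in MB/second per physical core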
I ran rhel_iotest.sh on an i3.16xlarge.
8 Ephemeral Drives Striped using LVM as /saswork
RESULTS
-------
INVOCATION: rhel_iotest -t /saswork
TARGET DETAILS
directory: /saswork
df -k: /dev/mapper/saswork-root 14841653248 33840 14841619408 1% /saswork
mount point: /dev/mapper/saswork-root on /saswork type xfs (rw,relatime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=4096,noquota)
filesize: 390.08 gigabytes
STATISTICS
read throughput rate: 384.15 megabytes/second per physical core
write throughput rate: 138.46 megabytes/second per physical core
-----------------------------
********* ALL ERRORS & WARNINGS *********
<<WARNING>> insufficient free space in [/saswork] for FULL test.
<<WARNING>> - smaller stack size and # of blocks will be used.
*****************************************
Note: Not sure why there is a warning about space; the /saswork drive started the test with 14 TB of free space. Perhaps the amount of free space overflowed some sort of integer value in the script.
----------------------------------------------------------------------------------------------
3 X 12.5TB Drives in LVM Stripe for /users
RESULTS
-------
INVOCATION: rhel_iotest -t /users
TARGET DETAILS
directory: /users
df -k: /dev/mapper/users-root 40263219200 34032 40263185168 1% /users
mount point: /dev/mapper/users-root on /users type xfs (rw,relatime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1536,noquota)
filesize: 480.09 gigabytes
STATISTICS
read throughput rate: 39.68 megabytes/second per physical core
write throughput rate: 49.73 megabytes/second per physical core
Margaret, were you using EBS Provisioned IOPS SSD or EBS General Purpose SSD for your second test when you striped 3 x 12.5TB volumes?
My results for an i3.8xlarge, using 2 of the 4 available ephemeral NVMe drives, striped and tested with rhel_iotest.sh.
RESULTS
-------
INVOCATION: rhel_iotest -t /saswork
TARGET DETAILS
directory: /saswork
df -k: /dev/md0 3708852856 134243856 3574609000 4% /saswork
mount point: /dev/md0 on /saswork type xfs (rw,noatime,nodiratime,seclabel,attr2,nobarrier,inode64,logbufs=8,sunit=1024,swidth=2048,noquota)
filesize: 182.95 gigabytes
STATISTICS
read throughput rate: 243.40 megabytes/second per physical core
write throughput rate: 95.27 megabytes/second per physical core
********* ALL ERRORS & WARNINGS *********
<<WARNING>> insufficient free space in [/saswork] for FULL test.
<<WARNING>> - smaller stack size and # of blocks will be used.
*****************************************
Please note that, as of 08AUG2017, the preferred AWS EC2 instance for SAS Foundation, SAS Viya, and SAS Grid compute nodes is the I3 instance running RHEL 7.4. RHEL 7.4 adds the drivers required to support the NVMe drives in the I3 instances, along with the Elastic Network Adapter (ENA) for enhanced networking.