
Important Performance Considerations When Moving SAS Applications to the Amazon Cloud

Last Updated: 07JUL2018

 

Executive Summary

Any architecture that is chosen by a SAS customer to run their SAS applications requires:

 

  • a good understanding of all layers and components of the infrastructure,
  • an administrator to configure and manage the infrastructure, and
  • the ability to meet SAS's requirements not just to run the software, but also to allow it to perform well.

 

UPDATE: Margaret presented a paper at SAS Global Forum 2018 with updated information on this subject. Please refer to the information in that paper over what is in this post: Important Performance Considerations When Moving SAS to a Public Cloud (https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/1866-2018.pdf)

 

 

This post discusses important performance considerations for SAS 9 (both SAS Foundation and SAS Grid) and SAS Viya in the Amazon cloud.

 

Many companies are deciding to move their data centers, including their SAS applications, from on-premises to a public cloud. Because of this, SAS customers are asking whether SAS can run in a public cloud. The short answer is yes: SAS has been tested without errors in many of the public clouds. The more in-depth answer is that customers need to understand their computing resource needs before committing to move a SAS application that is performing well on-premises to any public cloud.

 

Performing a detailed assessment of the computing resources needed to support a customer's SAS applications is required before deciding to go to production in the public cloud. Existing customers should use the information from the assessment to set up instances and storage as a proof of concept in the public cloud before making the decision to move. It is important that they work closely with their IT team to examine all compute resources, including IO throughput, number of cores, amount of memory, and amount of physical disk space. It is also important to see how close their existing compute infrastructure is to meeting their SAS users' current SLAs. The compute resource data can be gathered by monitoring the existing systems in their SAS infrastructure with external hardware monitoring tools such as nmon for Linux and AIX and perfmon for Windows. For new customers, this decision is more difficult since they, and we, may not know exactly how they will be using SAS and what their workload demands will be.
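For example, on Linux a long-running nmon collection is usually enough to establish the baseline; the interval, snapshot count, and the Windows collector-set name below are illustrative assumptions, not requirements.

# Capture roughly 24 hours of CPU, memory, and disk metrics at 60-second intervals
# (-f writes a spreadsheet-format .nmon file in the current directory)
nmon -f -s 60 -c 1440

# On Windows, a pre-defined perfmon data collector set can be started similarly, e.g.:
#   logman start SAS_Baseline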

 

Once the above information is gathered, it should be used to determine which AWS EC2 instance is the best fit for the various SAS tiers. The AWS EC2 instance types that are a good fit for SAS are summarized below. One thing not captured in that summary is that the maximum IO throughput you can get between an AWS EC2 instance and EBS storage is 50 MB/second/core. This is why using AWS EC2 instances without ephemeral storage is not a good idea: your customer would have to put SAS WORK/UTILLOC on this slow storage.
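To put that limit in perspective, consider a purely illustrative calculation: an instance with 16 physical cores could pull at most 16 x 50 = 800 MB/sec from EBS, while SAS WORK and UTILLOC typically need 100-150 MB/sec/core, or 1,600-2,400 MB/sec on the same instance, well beyond what EBS alone can deliver.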

 

 

 

AWS EC2 instance type comparison:

  • I3 (High I/O): Intel E5-2686 v4 (Broadwell); adequate memory; network up to 20 Gbit; internal storage: NVMe SSDs. Only supported with RHEL 7.3, and you need to run "yum update kernel" to get the NVMe fix.
  • I2 (High I/O): Intel E5-2670 v2 (Ivy Bridge); adequate memory; network up to 10 Gbit; internal storage: SSDs.
  • R4 (Memory Optimized): Intel E5-2686 v4 (Broadwell); adequate memory; network up to 20 Gbit; no internal storage (EBS-only).
  • R3 (Memory Optimized): Intel E5-2670 v2 (Ivy Bridge); adequate memory; network up to 10 Gbit; internal storage: SSD (limited).
  • D2 (Dense Storage): Intel E5-2676 v3 (Haswell); adequate memory; network up to 10 Gbit; internal storage: HDD.


It must be determined that the instance has enough physical cores (vCPUs/2, because the vCPUs are actually hyperthreads; see the first sketch after the list below for a quick way to check). Also, the AWS-specific SAS SETINIT must be applied so that the SAS deployment can access the number of physical cores that have been licensed. SAS runs on physical cores, but not well on hyperthreads, because of floating-point-unit sharing issues. In addition to provisioning cores (dividing the number of vCPUs by two), it must be determined that there is enough memory AND IO throughput for the SAS applications. Here are some considerations for various SAS applications that should be well understood before choosing AWS EC2 instances:

 

  • SAS WORK and UTILLOC file systems need the most IO throughput, typically much more than can be achieved with EBS storage, which is limited to a maximum of 50 MB/sec/core. SAS WORK and UTILLOC require a minimum of 100-150 MB/sec/core, depending on which SAS procedures are used. Therefore, they should be placed on ephemeral storage internal to the instance, described in more detail below.
  • SAS Viya applications require robust IO throughput as well as large amounts of memory, because CAS will page data to the storage device if there is not enough physical RAM to hold all the data files in memory. Slow IO throughput will greatly impact the performance of SAS Viya. Please note that SAS Viya is only available on Red Hat Enterprise Linux (RHEL) 6.7 or higher and 7.1 or higher, and the same releases of Oracle Enterprise Linux (OEL).
  • There are limited storage options available at Amazon. Here is a list of what works and doesn’t work. 
    • Ephemeral storage consists of disks internal to the AWS EC2 instance. It must be configured as a local file system (on RHEL, XFS or EXT4). All data on this storage will disappear with a reboot or restart of the AWS EC2 instance. We strongly suggest that you stripe all the ephemeral disks together in a RAID0 (a minimal command sketch follows this list).
    • EBS volumes are network-attached storage in the AWS infrastructure. Data on them will persist after a reboot or restart, but IO throughput to this storage is limited. We strongly suggest that you use at least 4 (preferably 8) ST1 EBS volumes striped together.
    • EFS storage is available from AWS. Unfortunately, this storage does not have the file locks required by SAS, so it cannot be used for any SAS files or binaries.
    • Intel Cloud Edition for Lustre File System storage is the only shared file system in Amazon that has been tested with SAS Grid Manager; however, the future of Lustre is uncertain, as Intel has contributed the Lustre code base to the open source community and no longer provides Intel-branded releases. Intel will provide support for Lustre for the next two years.
  • IO throughput is crucial for SAS Foundation and SAS Grid deployments. Be sure to choose AWS EC2 instances that are designed for large sequential IO, not IOPS.
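First, a quick way to confirm the physical core count on a provisioned instance (a minimal sketch; lscpu reports each hyperthread, so counting distinct core IDs gives the physical cores):

# Show vCPU, thread, core, and socket counts
lscpu | grep -E '^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'

# Count distinct physical cores (each pair of hyperthreads shares one core ID)
lscpu -p=CORE | grep -v '^#' | sort -u | wc -l

Second, a minimal sketch of the storage layout described above: ephemeral disks striped in a RAID0 for SAS WORK/UTILLOC, and ST1 EBS volumes striped with LVM for permanent SAS data. The device names (/dev/nvme1n1 and so on, /dev/xvdf and so on), mount points, and mount options are assumptions that will differ by instance type and AMI.

# --- Ephemeral SAS WORK/UTILLOC (contents are lost on an instance stop/restart) ---
# Check the real device names with lsblk before running any of this
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
mkfs.xfs /dev/md0
mkdir -p /saswork
mount -o noatime,nodiratime /dev/md0 /saswork

# --- Permanent SAS data on striped ST1 EBS volumes (persists across restarts) ---
pvcreate /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
vgcreate vg_sasdata /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
lvcreate -n lv_sasdata -i 4 -I 64 -l 100%FREE vg_sasdata   # stripe across all 4 volumes
mkfs.xfs /dev/vg_sasdata/lv_sasdata
mkdir -p /sasdata
mount -o noatime,nodiratime /dev/vg_sasdata/lv_sasdata /sasdata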

 

Currently, the best AWS EC2 instance to use with SAS Foundation, SAS Grid, and SAS Viya is the I2 instance family. Please note the cost of I2 instances (http://www.ec2instances.info/), and recognize that some customers may want to go with a cheaper AWS EC2 instance. Before you suggest a different AWS EC2 instance, please make sure it can deliver IO throughput to the ephemeral storage for SAS WORK or Viya CASCACHE equivalent to that of the AWS EC2 I2 instances.

 

While we are talking about Amazon, there are several potentially inaccurate preconceptions about how SAS will run in AWS. Let's discuss some of the more popular ones:

 

  • It will be cheaper to run SAS in the public cloud. If your customer stands up only the bare minimum number of cores and physical disk space to save money, there is a strong possibility that they will not be able to obtain the throughput performance needed to keep SAS users happy. More expensive EC2 instances with higher core counts may be required to provision adequate IO bandwidth, depending on the SLAs with their SAS users.
  • SAS will run faster in the public cloud than on-premises. Public clouds have not built their infrastructure, particularly network interconnects, to support high volumes of large sequential IO. To get the IO throughput needed, SAS applications will need to be spread across multiple EC2 instances connected to a shared file system that is designed to spread its data across multiple volumes, as Lustre does. Adequate IO throughput can be achieved with this infrastructure, but it can come with a larger price tag.
  • Administrators will not be needed with a move to the public cloud. It is true that the public cloud does away with management of physical infrastructure; however, this "void" is replaced with management of the cloud architecture. Cloud Architects understand how to interact with the cloud infrastructure (IaaS) provider, including provisioning infrastructure services, integrating cloud services with on-premises systems, and securing the applications, data, and systems running on that IaaS. It is also a myth that HA and DR are provided for free in the cloud; implementing High Availability and Disaster Recovery for critical systems is an additional responsibility of a Cloud Architect. Administrators are also needed for the instance's host operating system and relational databases. For example, your customer needs an operating system administrator to configure the file systems needed for use with SAS as well as to tune the operating system for ideal performance. Along with initial setup of the file systems, additional scripting needs to be done to both reconnect their permanent SAS data files and recreate their ephemeral storage after a restart/reboot of an instance (a sketch follows this list).
  • SAS can take advantage of the bursting features of AWS.  Deploying your SAS applications in Amazon does not mean that they are elastic and can leverage the AWS auto scaling capabilities.   You should refer to SAS product documentation to determine if bursting is supported and how it is implemented.
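To illustrate the point about re-creating ephemeral storage, a boot-time script along these lines (run from rc.local, a systemd unit, or cloud-init, for example) could rebuild SAS WORK on every instance start; the devices and paths are the same illustrative assumptions used in the earlier sketch.

#!/bin/sh
# Rebuild ephemeral SAS WORK after an instance stop/start; its contents do not persist.
DEVICES="/dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1"

mdadm --create /dev/md0 --level=0 --raid-devices=4 --run $DEVICES
mkfs.xfs -f /dev/md0
mkdir -p /saswork
mount -o noatime,nodiratime /dev/md0 /saswork
chmod 1777 /saswork   # SAS WORK must be writable by all SAS users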

 

Bottom line:

SAS customers may have to stand up more compute resources in the public cloud than their EEC sizing suggests and/or than exist in their on-premises system - more cores, more instances, and more physical disk space - in order to meet their SAS applications' needs, especially from an IO throughput perspective.

 

References:

https://support.sas.com/resources/papers/Implementing-SAS9-4-Software-Cloud-Infrastructures.pdf

 

 Contacts:

  • Margaret Crevar

Margaret.Crevar@sas.com

+1 919.531.7095

 

  • Ande Stelk

Ande.Stelk@sas.com

+1 919.531.9984


10 REPLIES
MargaretC
SAS Employee

Just learned from Red Hat that a driver has been added to RHEL 7.3 so that you can use the AWS EC2 I3 instances now.  

 

For now, you spin up an I3 instance and use the RHEL 7.3 AMI.  Once this is up you will need to issue this command "yum update kernel".

 

There is no support for the ENA  (enhanced network) until RHEL 7.4, due out later this year.
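Put together, the steps above look roughly like this (the lsblk check at the end is only a sanity check, not a SAS requirement):

# On a freshly launched I3 instance built from the RHEL 7.3 AMI
yum update kernel     # pulls in the kernel fix needed for the NVMe devices
reboot

# After the reboot, the instance-store NVMe devices should be visible
lsblk | grep nvme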

 

 

bconneen
Fluorite | Level 6

We've been running our SAS install on AWS for about a year.

 

i2.8xlarge.  We striped the 8 ephemeral disks as RAID 0 for SASWORK.

 

We are currently using Provisioned IOPS (io1) for our user data, single drive not striped.

 

A couple of questions based on the very helpful post (wish we had it 1 year ago)!!! 🙂


It seems you are recommending we move to a set of 4 (or even 8) ST1 drives for user data.  Is the performance stronger?  Any concerns about lack of IOPS?  What kind of stripe configuration do you recommend?

 

Do you recommend running fstrim on the Ephemeral disks on any interval to mitigate any concerns there?

 

What is the downside to not using the "AWS-specific SETINIT"?  What exactly is the "AWS-specific SETINIT"?

 

Appreciate your thoughts and consideration.

 

Brian Conneen

brian@marlettefunding.com

 

 

MargaretC
SAS Employee

The type of IO SAS does is large sequential IO, not random IO (IOPS).  Because of this, you need lots of disk heads to support the IO needs, even for your permanent SAS data files.  That is why Amazon has recommended that, if you are using EBS storage, you use the ST1 type and create multiple ST1 EBS volumes that you stripe together.

 

Not sure what FSTRIM is.  I will check with Amazon to find out their thoughts.

 

No downside to using an "AWS-specific SETINIT" with SAS.  Just please make sure SAS Technical Support knows you are running SAS on AWS if you call in with any issues.

 

Cheers,

Margaret

bconneen
Fluorite | Level 6
Margaret,

Thanks for the prompt reply. Just want to double check that you still feel that the Instance (Ephemeral) storage in a RAID0 configuration is best for SASWORK.

I'm going to do some testing with an i3.16xlarge today. It seems the closest to the i2.8xlarge in terms of available bandwidth in Instance storage.

Thanks!
MargaretC
SAS Employee

RAID0 for SAS WORK and SAS UTILLOC on ephemeral storage is fine.  It is what we recommend.

 

Please note that with AWS EC2 I3 instances there are issues with connecting to the NVMe SSD drives with RHEL.  This has been fixed with RHEL 7.3 only.  You will need to use the existing "community" RHEL 7.3 AMI.  Once it has been loaded, you will need to issue a "yum update kernel" command to apply the patches needed to enable the NVMe SSDs to work.

Please let us know if you have any questions on the above.

bconneen
Fluorite | Level 6

Update on my comparisons between an i2.8xlarge and an i3.16xlarge.

 

Price: $25.55 a month more for the i3.16xlarge. Linux Reserved on 1 year agreement.

Total Capacity:  i2.8xlarge = 5.82 TB; i3.16xlarge = 13.82 TB

 

Took both box types and put the 8 instance disks into an LVM striped configuration formatted with XFS.

 

Used bonnie++ to benchmark them.
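For reference, a bonnie++ run along these lines exercises sequential throughput on the striped file system; the size, path, and user are assumptions rather than the exact invocation used for the numbers below.

# -d target directory, -s total file size in MB (well above RAM so the file cache
# cannot hide the disks), -n 0 skips the small-file tests, -u user to run as
bonnie++ -d /saswork -s 1000000 -n 0 -u ec2-user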

 

i2.8xlarge

Sequential Read: 1936.4 MB / sec

Sequential Write: 1153.6 MB / sec

Sequential ReWrite: 816.2 MB /sec

 

i3.16xlarge

Sequential Read: 2578.9 MB / sec  (33% improvement)

Sequential Write: 1166.5 MB / sec  (1% improvement)

Sequential ReWrite: 978.1 MB / sec (20% improvement)

 

 

MargaretC
SAS Employee

I am not familiar with bonnie++.  Does it do direct IO or go through the operating system's file cache?  And did you set the page size to 64K, which is what SAS 9.4 uses by default?

 

SAS has written a script that mimics how SAS does IO.  I would love for you to run it and let us know what the results on the two instances are.  You can find the tool here http://support.sas.com/kb/59/680.html 
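As a quick cross-check outside any benchmark tool, a dd run with a 64K block size and direct IO roughly mimics the large sequential writes SAS 9.4 issues (a sketch only; rhel_iotest remains the recommended test):

# Write 16 GB sequentially in 64K blocks, bypassing the operating system's file cache
dd if=/dev/zero of=/saswork/ddtest.dat bs=64k count=262144 oflag=direct
rm /saswork/ddtest.dat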

 

Margaret

bconneen
Fluorite | Level 6

I ran rhel_iotest.sh on an i3.16xlarge.

 

8 Ephemeral Drives Striped using LVM as /saswork

 

RESULTS
-------
INVOCATION: rhel_iotest -t /saswork

TARGET DETAILS
directory: /saswork
df -k: /dev/mapper/saswork-root 14841653248 33840 14841619408 1% /saswork
mount point: /dev/mapper/saswork-root on /saswork type xfs (rw,relatime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=4096,noquota)
filesize: 390.08 gigabytes

STATISTICS
read throughput rate: 384.15 megabytes/second per physical core
write throughput rate: 138.46 megabytes/second per physical core
-----------------------------


********* ALL ERRORS & WARNINGS *********
<<WARNING>> insufficient free space in [/saswork] for FULL test.
<<WARNING>> - smaller stack size and # of blocks will be used.
*****************************************

 

 

**Note: Not sure why there is a warning message about space; the /saswork drive started the test with 14 TB of free space. Perhaps the amount of free space overflowed some sort of integer value in the script.

 

----------------------------------------------------------------------------------------------

 

3 X 12.5TB Drives in LVM Stripe for /users

 

RESULTS
-------
INVOCATION: rhel_iotest -t /users

TARGET DETAILS
directory: /users
df -k: /dev/mapper/users-root 40263219200 34032 40263185168 1% /users
mount point: /dev/mapper/users-root on /users type xfs (rw,relatime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1536,noquota)
filesize: 480.09 gigabytes

STATISTICS
read throughput rate: 39.68 megabytes/second per physical core
write throughput rate: 49.73 megabytes/second per physical core

gafrer
SAS Employee

Margaret, were you using EBS Provisioned IOPS SSD or EBS General Purpose SSD for your second test when you striped 3 x 12.5TB volumes?

 

My results for an i3.8xlarge, using 2 of the 4 available ephemeral NVMe drives striped together and tested with rhel_iotest.sh:

 

RESULTS

-------

INVOCATION:  rhel_iotest -t /saswork

 

TARGET DETAILS

directory:    /saswork

df -k:        /dev/md0       3708852856 134243856 3574609000   4% /saswork

mount point:  /dev/md0 on /saswork type xfs (rw,noatime,nodiratime,seclabel,attr2,nobarrier,inode64,logbufs=8,sunit=1024,swidth=2048,noquota)

filesize:     182.95 gigabytes

 

STATISTICS

read throughput rate:   243.40 megabytes/second per physical core

write throughput rate:  95.27 megabytes/second per physical core

 

 

********* ALL ERRORS & WARNINGS *********

<<WARNING>>  insufficient free space in [/saswork] for FULL test.

<<WARNING>>       - smaller stack size and # of blocks will be used.

*****************************************

MargaretC
SAS Employee

Please note, as of 08AUG2017 the preferred AWS EC2 instance for SAS Foundation, SAS Viya and SAS Grid compute nodes is the I3 instance running RHEL 7.4.  RHEL 7.4 has added the drivers required to support the NVMe drives in the I3 instances along with the Enhanced Network Adaptors.