At SAS, the Global Enablement & Learning team (GEL) recently had the opportunity to make some important changes to our automated processes that manage resources hosted in AWS. Instead of continuing to run those processes on an internal Jenkins server, we've moved them to run in AWS itself.
As for the code we moved, it didn't require any changes to execute in the new environment, which was both very cool and somewhat surprising. Setting up the environment in AWS involved a bit of a learning curve, though. But now that the trail has been blazed, it's pretty straightforward to share with you.
In this multi-part series, we'll begin here by looking at the steps to prepare and build an application to run in AWS. Then we'll explain how to deploy and run your project in AWS. Later in the series, we'll look at the considerations to schedule and monitor your jobs.
And finally, check out @FrederikV 's post, Running SAS Scoring Runtime Containers through AWS Fargate, where he takes the principles explained in this series and applies them to running analytic models produced by SAS Viya using a serverless approach in AWS.
The GEL team's automated process for managing resources hosted in AWS is pretty lightweight. It's basically a group of shell scripts that rely on the AWS CLI as well as an external project written primarily in Go. Relatively speaking, it consumes minimal CPU and RAM, needs no persistent disk, and its runtime is usually between 1 and 10 minutes. It's not especially time-critical either, so we can be flexible on resource availability as well (helping reduce costs).
With that in mind, we elected to go with AWS Fargate which offers serverless, pay-as-you-go compute resources. It expects to run your code from a container and automatically handles the compute resources needed.
AWS offers tremendous variety and flexibility to get things done. While we'll look at one approach here, there are likely a multitude of other options that you might consider for your projects along the way.
AWS Batch is a fully managed batch computing service for executing containerized workloads on AWS compute offerings like AWS Fargate (as well as EC2, EKS, ECS, and more). For our purposes, the container images will be stored as repositories in the Amazon Elastic Container Registry.
With Batch, we can create job definitions to run in the desired compute environment. We can then reference those job definitions to submit ad-hoc jobs on the fly or on a schedule for repeated execution.
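To make that concrete, here's a hedged sketch of the kind of command we'll eventually use to submit a job. The queue and job definition names below are placeholders; we won't actually create them until Part 2:

# Sketch only: submit a job against a (not-yet-created) queue and job definition
aws batch submit-job \
    --job-name hello-aws-adhoc \
    --job-queue my-fargate-queue \
    --job-definition hello-aws-jobdef \
    --container-overrides '{"command":["echo","submitted from the CLI"]}'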
Amazon EventBridge extends AWS Batch with additional event-driven capabilities. We'll use it for scheduling the jobs that run the GEL team's process for managing AWS resources.
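As a preview (the details come in Part 3), the schedule itself boils down to an EventBridge rule along these lines. The rule name and cron expression here are just illustrative, and the rule still needs a target that points at the Batch job definition:

# Sketch only: fire an event every day at 06:00 UTC
aws events put-rule \
    --name gel-nightly-maintenance \
    --schedule-expression "cron(0 6 * * ? *)"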
With everything defined and running, we can monitor the job activity using Amazon CloudTrail for discrete events and Amazon CloudWatch for log groups and anomaly detection.
Under the covers for all of this is AWS Identity and Access Management. Getting the roles, permissions, policies, and trust relationships right is an important task that helps make this all work.
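To give a flavor of what that involves, a role used by a Fargate-backed Batch job typically carries a trust relationship that lets the ECS tasks service assume it, something like the sketch below. The specific roles and policies this project needs are covered later in the series:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}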
To get started, we'll need an environment with the tools necessary to get things rolling. That might be your own desktop PC or perhaps a VM in the cloud. Wherever it is, make sure it has the utilities and network access needed.
> Docker
Did you notice above when I mentioned that AWS Batch is designed for running containerized workloads? That means we need to create a Docker Container Image as the base for running our project code.
Of course, we're not required to use Docker and could opt for an alternative like podman, containerd, or any other suitable option. But for this exercise, let's stick with running Docker Desktop as the example.
> AWS CLI
You'll also need the AWS command-line interface utility installed on your PC (or wherever you're building this project). The AWS CLI can be used exclusively, if desired, or in addition to the AWS Console web site. Where suitable, I'll try to provide instructions for both to help keep the relationship clear.
Make sure you can authenticate with the AWS CLI using "aws configure sso". If you're inside of the SAS network, instructions to help complete the prompts are provided as part of our corporate single sign-on at: https://go.sas.com/awsic. Once authenticated, select your AWS account and role and then click the link for "Command line or programmatic access".
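If you haven't set up SSO authentication before, the flow generally looks something like this. The profile name, session name, and start URL below are placeholders; use the values your organization provides:

# One-time setup of an SSO-backed profile (answers shown are placeholders)
aws configure sso
#   SSO session name: my-sso
#   SSO start URL:    https://my-org.awsapps.com/start
#   SSO region:       us-east-1
#   ...complete the sign-in in your browser...

# Later, refresh your credentials without re-running the wizard
aws sso login --profile my-profile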
To confirm:
AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
echo "Your AWS account ID is: $AWS_ACCOUNT_ID"
Make sure the value returned is similar to "123456789012"; otherwise, the instructions later won't work.
> Your project code
The objective is to get your project code containerized and made ready for execution in AWS Fargate. So, have that ready in a local directory.
As a placeholder for your project, we'll provide a simple example. Place the script below into a local file named "hello-aws.sh" on your PC:
#!/bin/bash
# My "Hello, AWS" script
# Very simple script:
# - use 'echo' to output any string
# - use 'aws' to send command to AWS CLI
# - else will produce warning msg
while [[ $# -gt 0 ]]; do
    key="$1"
    case $key in
        echo) # task = echo
            shift
            echo -e "\n---\n$@\n---\n"
            shift $#
            ;;
        aws) # task = aws CLI
            echo -e "\n---\n$@\n---\n"
            shift
            aws $@
            shift $#
            ;;
        *) # for anything else
            echo -e "\n---\nWARNING: Unexpected input.\n---\n"
            shift $#
            ;;
    esac
done
This script illustrates the two behaviors I'd like to demonstrate: echoing back an arbitrary string, and passing a command through to the AWS CLI.
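If you'd like to sanity-check the script locally before containerizing it (assuming you have a bash shell on your PC), a quick test might look like this:

chmod +x hello-aws.sh
./hello-aws.sh echo "testing locally"

# Expected output:
# ---
# testing locally
# ---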
Let's build a container. To give Docker the instructions it needs, we'll create a file named "Dockerfile" with this content:
# Build this container to run in AWS Batch
# Start with the latest AWS-CLI v2 image
FROM public.ecr.aws/aws-cli/aws-cli:latest
# Consider using "--platform=linux/amd64" if building on Apple Silicon
# Install additional software as needed
RUN yum update -y && yum install -y wget unzip
# Copy your project code into the container
COPY hello-aws.sh /
# Default execution
ENTRYPOINT ["/bin/bash","/hello-aws.sh"]
# Default parameters
CMD ["echo","hello, aws!"]
Let's take a quick look at the Dockerfile directives:
FROM:
One fun thing about containers is that you get to choose your starting place. Have you ever baked brownies from a box? They've done a lot of the work for you with the flour, chocolate, baking soda, yeast, etc. You just need to add a couple more ingredients to finish the recipe. We're going to do the same thing here.
Because our GEL project makes heavy use of the AWS CLI, we'll opt to start with an Amazon-provided image that has the AWS CLI already installed. One really nifty benefit of this approach is that the AWS CLI in this image will automatically inherit the IAM role we're using when it runs -- no additional authentication required when using this container in AWS to run jobs later!
RUN:
You can run additional commands to do nearly anything to make changes to the container. In this case, we're installing some additional software that's not included in the AWS CLI image.
As a rule, images are often stripped down to the bare minimums to keep sizes small and security exposure minimal.
COPY:
Copies static files (or directories) into the image.
The first argument is a file (or directory) that must exist in your CWD (the Docker build context). Docker won't let you cheat and provide a path to other locations elsewhere in your PC's file system.
The second argument is where to place the file (or directory) using an absolute path reference inside the image. We're copying hello-aws.sh to the root (top) directory inside the container.
ENTRYPOINT:
This directive is what's called when you run the container. It will always be invoked with the command string given here (unless the --entrypoint override is given at run time; see the quick example after these directive notes).
CMD:
These are the default parameters given to the ENTRYPOINT unless others are provided at container invocation.
In this case, if we run the container without any other command-line arguments, then it should echo out the string "hello, aws!".
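Here's the quick example mentioned above: once the image is built (next step), you can override the ENTRYPOINT at run time to get an interactive shell inside the container instead of running the script. This is purely a debugging convenience, not something this project requires:

# Drop into a shell inside the image rather than running hello-aws.sh
docker run -it --entrypoint /bin/bash hello-aws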
At this point on your PC, you should have two files in the current directory:
hello-aws.sh: our custom code "project"
Dockerfile: instructions for Docker to build our project's container image
Building the container is pretty easy at this point:
# the image name
MY_IMAGE="hello-aws"
# use the files in this directory (".") to build the image
docker build -t $MY_IMAGE .
With results similar to:
[+] Building 0.1s (8/8) FINISHED docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 432B 0.0s
=> [internal] load metadata for public.ecr.aws/aws-cli/aws-cli:latest 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [1/3] FROM public.ecr.aws/aws-cli/aws-cli:latest 0.0s
=> CACHED [2/3] RUN yum update -y && yum install -y wget unzip 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 892B 0.0s
=> [3/3] COPY hello-aws.sh / 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:a9239b60aac68312b402c0f1d1345f5134a56f7fbbe8452e24dde6b2878105e7 0.0s
=> => naming to docker.io/library/hello-aws 0.0s
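One aside before testing: the Dockerfile comment above mentions Apple Silicon. The Fargate environment we'll use later runs x86_64 by default, so if you're building on an ARM-based Mac you may want to target that platform explicitly; whether you need this is a judgment call that depends on your Docker setup:

# Optional: build an x86_64 image on an ARM-based machine
docker build --platform linux/amd64 -t $MY_IMAGE .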
Now you've got a container, but does it actually work like it's supposed to? Let's try it out:
docker run -it hello-aws
---
hello, aws!
---
docker run -it hello-aws echo Break the rules. Keep the faith. Fight for love.
---
Break the rules. Keep the faith. Fight for love.
---
docker run -it hello-aws aws account list-regions
---
aws account list-regions
---
Unable to locate credentials. You can configure credentials by running "aws configure".
That kinda worked, but it also failed. The script announces the AWS CLI command it's trying like we told it to, but then the AWS CLI complains it doesn't know who you are and wants you to configure authentication.
Earlier in this post, where we covered the attributes of your build environment, I mentioned you need to have already installed the AWS CLI on your PC and authenticated using "aws configure sso". If that's already done, then we can direct Docker to share your current credentials on your PC with the AWS CLI that's running inside the "hello-aws" container. We'll define a bind mount (using the -v option) to do that, connecting your local .aws directory to the equivalent location for the root user inside the container.
docker run -it -v ~/.aws:/root/.aws hello-aws aws account list-regions
---
aws account list-regions
---
{
"Regions": [
{
"RegionName": "af-south-1",
"RegionOptStatus": "DISABLED"
},
{
"RegionName": "ap-east-1",
"RegionOptStatus": "DISABLED"
},
{
"RegionName": "ap-northeast-1",
"RegionOptStatus": "ENABLED_BY_DEFAULT"
},
... and many more ...
docker run -it hello-aws Do something else.
---
WARNING: Unexpected input.
---
At this point, we've got our "hello-aws" container built and it's working fine, passing all unit tests. 😉
When we run the container in AWS directly, we won't need to include the bind mount for credentials (or authenticate any other way); it will automatically inherit our account role in AWS.
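A simple way to confirm that later, once the job is running in AWS, is to reuse the identity check from earlier as the job's command. Conceptually (the actual Batch override syntax comes in Part 2), the container would run:

# Inside the Fargate task, our script passes this through to the AWS CLI:
aws sts get-caller-identity
# The "Arn" in the output should reference the job's assumed role,
# not your personal SSO credentials.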
We've made some significant progress at this point. We evaluated our project code to determine its runtime behavior and resource requirements. With that, we started down the path of using AWS Batch to run our project, with Fargate as the serverless compute environment, chosen for its simplicity and low cost.
Then we containerized our project code and confirmed it runs as designed. This particular container is built from an AWS image that already has the AWS CLI built in. We've also confirmed that the container's AWS CLI is able to function as expected.
In Part 2 of this series, called "Deploy and Run", we'll continue by registering and pushing the "hello-aws" container repository up to AWS ECR, where it can be referenced by the AWS Batch service. To do that, we'll define a compute environment in Fargate and create job definitions that reference our project container. Then we can submit ad-hoc jobs and specify overriding parameters to the container so it produces the output we desire.
Part 3, "Schedule and Monitor", will pick up from there to establish a schedule to run the job on a regular basis. Then we can monitor its activity using tools from CloudWatch and CloudTrail.
Find more articles from SAS Global Enablement and Learning here.