Following on from a recent post on “High Availability considerations for MAS on SAS Viya”, I felt it would be good to look at the considerations for SAS Container Runtime. In this post I will look at using Availability Zones on the Microsoft Azure platform.
As a SAS Container Runtime image is an OCI-compliant container image, there are few constraints when it comes to running the Container Runtime images.
Let’s look at some of the considerations…
A good place to start, when thinking about high availability, is to understand the dependencies for running the container image. From a SAS Container Runtime perspective, we need to understand if there are any database dependencies (any database connections).
If a SAS Container Runtime image does rely on connections to external databases, then we should also ask whether that source is highly available, as the overall availability of a system is only as good as its weakest link: the least available resource within the process chain (the upstream or downstream systems).
Another consideration is whether the SAS Container Runtime image and the database should be collocated. Often a database is in a separate security zone, but to minimise latency for real-time transactions, a good practice is to collocate the model with the data (if possible). This is important to understand when considering the use of Availability Zones.
Now let’s look at some of the options for availability when running in Azure.
For this we will look at running on the Azure Kubernetes Service (AKS) platform, but there are other options, such as Azure Container Instances (ACI); both can provide redundancy (HA) for a deployment.
When deploying to AKS, the simplest option is to run multiple replicas of the SAS Container Runtime pod, spread across the nodes in the cluster (for example, using pod anti-affinity), as shown later in this post.
This provides redundancy in the runtime environment, but you could take this to the next level by using Availability Zones. While SAS Viya doesn’t officially support the use of Availability Zones, the SAS Container Runtime images are independent (self-contained) images, and we can deploy multiple instances to run in different Availability Zones.
Many Azure regions provide availability zones, which are separated groups of data centers within a region. Each availability zone has independent power, cooling, and networking infrastructure. This is depicted in the following diagram from the Microsoft documentation. Also see the “Azure regions list” for information on which regions offer availability zones.
Source: Microsoft Azure documentation
A key aspect to understand is how many zones are available within a region, as not all regions have the same number of Availability Zones.
For my testing I used the SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure GitHub project to create the AKS cluster. The IaC only supports configuring multiple availability zones for the default (system) node pool; you do this using the default_nodepool_availability_zones parameter. You would also want to set the minimum number of nodes. For example, to use three zones:
default_nodepool_min_nodes = 3
default_nodepool_availability_zones = ["1", "2", "3"]
After the AKS cluster was built, I used the Azure Portal to add a new node pool, called “models”, that used three availability zones. I could have also used the Azure CLI to do this, with the 'az aks nodepool add' command. You use the "--zones" parameter to specify the zones to be used; it is a space-separated list (--zones 1 2 3). For example:
az aks nodepool add --resource-group ${resource_group} --cluster-name ${cluster_name} \
--name models \
--node-vm-size Standard_D8s_v4 \
--os-type Linux \
--node-osdisk-size 200 \
--labels workload/class=models \
--node-taints workload/class=models:NoSchedule \
--enable-cluster-autoscaler \
--min-count 3 \
--max-count 3 \
--zones 1 2 3
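To confirm the zones assigned to the node pool once it has been created, you can query it with the Azure CLI. The following is a sketch; in my experience the agent pool object returned by the CLI includes an availabilityZones property, but check the output of your CLI version:

az aks nodepool show --resource-group ${resource_group} --cluster-name ${cluster_name} \
--name models \
--query availabilityZones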
After doing this, I had my Kubernetes cluster dedicated to running the model images. It had a “system” node pool and the “models” node pool.
The ‘kubectl get nodes’ command, with the “-L” option, allows you to display the zone that a node is running in. In Azure the nodes have the label: topology.kubernetes.io/zone
For example.
In the image you can see that I had three models nodes and three system nodes. The nodes are running in zones: eastus-1, eastus-2 and eastus-3.
The models nodes also have a custom label and taint applied.
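The view in the image was produced with a command along these lines, combining the zone label with the custom workload label that was applied to the models node pool:

# List the nodes, showing the zone and the custom workload label as extra columns
kubectl get nodes -L topology.kubernetes.io/zone -L workload/class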
To run the SAS Container Runtime pods you need an ingress. I configured the ingress controller to have 3 replicas. This gave me one ingress controller pod running on each node (due to the pod anti-affinity setting). For example.
In the image you can see they are running on the system nodes.
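The exact configuration depends on how the ingress controller was deployed, but as a sketch, the relevant part of the controller Deployment spec looks something like the following (the app.kubernetes.io/name label is an assumption for an NGINX ingress controller):

replicas: 3
template:
  spec:
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx   # assumed controller label
          topologyKey: kubernetes.io/hostname         # at most one replica per node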
For this testing I was using the Model Manager Quick Start Tutorial examples, and I configured the ‘qstree1’ model to have 3 replicas, providing a toleration for the models taint and configuring pod anti-affinity (a sketch of the manifest is shown below).
In the image you can see the pods are running on the models nodes.
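The following is a minimal sketch of the kind of deployment manifest described above. The deployment name, labels, image reference and container port are illustrative; the toleration and node selector match the taint and label that were applied to the models node pool:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: qstree1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: qstree1
  template:
    metadata:
      labels:
        app: qstree1
    spec:
      # Allow the pods to be scheduled on the tainted models nodes
      tolerations:
      - key: "workload/class"
        operator: "Equal"
        value: "models"
        effect: "NoSchedule"
      # Target the models node pool using its custom label
      nodeSelector:
        workload/class: models
      # Keep the replicas on separate nodes (and hence spread across the zones)
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: qstree1
            topologyKey: kubernetes.io/hostname
      containers:
      - name: qstree1
        image: myregistry.azurecr.io/qstree1:latest   # illustrative image reference
        ports:
        - containerPort: 8080                         # assumed container port

With three nodes in the models pool (one per zone) and the required anti-affinity rule, the three replicas end up running one per node, and therefore one per Availability Zone.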
Now that I had an ingress controller and the qstree1 pods running in each Availability Zone, it was time to test the configuration.
I used cURL to test the configuration, to call (run) the qstree1 model. For this, I wanted to see which pods (and nodes) were being used.
I used the ‘kubectl logs’ command, with the “-f” option, to follow the log messages. This was fine for the ingress controller pods, but I had to configure debug logging for the Container Runtime pods to see the curl session traffic.
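For example, to follow the logs (the pod name and the label selector shown here are illustrative):

# Follow the log of a single pod
kubectl logs -f qstree1-5f7c9d8b6-abcde

# Or follow all the pods in the deployment using a label selector
kubectl logs -f -l app=qstree1 --prefix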
For SAS Container Runtime, the logging is configured using environment variables. As I wanted to see the curl session traffic, the best option is to configure the SAS_SCR_LOG_LEVEL_SCR_IO environment variable and set it to “DEBUG”. For example, within the deployment manifest you need to add the following:
env:
- name: SAS_SCR_LOG_LEVEL_SCR_IO
value: "DEBUG"
This ensures that the cURL session traffic is written to the container log.
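As an alternative to editing the manifest, the variable could also be set (or changed) on an existing deployment; for example, assuming the deployment is called qstree1:

kubectl set env deployment/qstree1 SAS_SCR_LOG_LEVEL_SCR_IO=DEBUG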
I then used curl to run the model. The diagram below shows a summary of some of the session traffic and the pods/nodes that were used on each invocation of the model.
Here you can see that the session traffic is using multiple Availability Zones.
Finally, to show some example log output, the following is the cURL command that I used for testing:
curl --location --request POST 'http://'${INGRESS_FQDN}'/qs_tree1' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '{
  "inputs": [
    { "name": "CLAGE", "value": 94.36666667 },
    { "name": "DEBTINC", "value": 0 },
    { "name": "DELINQ", "value": 0 },
    { "name": "DEROG", "value": 0 },
    { "name": "VALUE", "value": 39025 }
  ]
}' | jq
Here is a sample of the ingress logs, showing the request to run the qs_tree1 model:
The SAS Container Runtime log showed the following:
The output from running the model is shown in the highlighted (red) text.
In this post we have looked at using Availability Zones in Azure for the SAS Container Runtime images. As you have to create the manifest to run the Container Runtime pods, you have complete control over the deployment, including the number of pod replicas, pod and node affinity, and the ability to make use of Availability Zones.
Always remember to check the Azure documentation to confirm the Availability Zone support, and the number of zones that are available in the region.
I hope that helps, and thanks for reading.
Find more articles from SAS Global Enablement and Learning here.