Hi everyone.
SAS 9.4 physical grids allow customers to spread large compute workloads across a number of smaller, cheaper servers. This avoids having to purchase a single large server, provides higher availability, and lets the grid be expanded over time to keep pace with growing demand.
A significant challenge for physical grids, however, is accommodating workloads that peak at certain times. Nightly batches or month-end processing frequently require a large number of grid nodes to meet the batch window, but once the workload has been processed these nodes are no longer needed and sit underutilized for the rest of the time.
With Viya Workload Management (WLM) you get the advantages of a regular SAS grid and you can now scale to meet an increase in demand. Critically, you can also scale down when resources are no longer required. The cost savings are significant.
In this blog I explore SAS WLM integration with the Kubernetes Cluster Autoscaler. I’ll show how we can scale up to process a batch workload and then scale back down once the workload has finished.
Adding a Kubernetes nodepool for batch workload
Hosts are Kubernetes nodes that run jobs issued from SAS Workload Manager. For a node to be considered for jobs, it needs to have the Kubernetes label workload.sas.com/class with a value of compute. This label is assigned when you define the nodepool. The default installation has a nodepool called compute, and by default hosts from this nodepool will run all jobs submitted to the grid.
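If you want to confirm which nodes currently carry that label, a quick check along the following lines can be run against the cluster (a minimal sketch; it simply lists the nodes and shows the value of the workload.sas.com/class label for each one).
# Show the workload.sas.com/class label for every node
kubectl get nodes -L workload.sas.com/class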
For this exercise I have decided to add a new host type that will run alongside the standard compute host type. This is useful because I want to pick a different type of machine for this workload. The jobs in question make heavy use of SAS WORK. The machine type I have chosen has multiple solid-state drives (SSDs) which will be assigned to SAS WORK, and this significantly improves performance. Machines with SSDs are expensive, so I don’t want them hanging around when not in use, hence the need to scale up and back down.
I am assuming a standard deployment of Viya that includes WLM. In this example I am using Google Cloud Platform (GCP), but the instructions should be similar for other cloud providers. As mentioned, I am going to use a dedicated nodepool for my batch workload. This will start from zero nodes and eventually return to zero. (It should be noted that the default compute nodepool can also be configured to auto-scale.)
The new nodepool will be called batchcompute. I’ll base the new nodepool definition on the standard compute nodepool definition, but a few changes are required. Firstly, an additional Kubernetes label is needed to enable WLM to differentiate these hosts from the regular compute hosts. While this label and value can be anything, you should pick something meaningful; I use workload.sas.com/wlm and assign it a value of batch. I am using the GCP machine type n2-standard-16 and I add 4 local SSDs which I’ll use for SAS WORK. For the rest of the settings I copy all the other labels, taints and so on from the standard compute nodepool definition.
Of course, I need to enable auto-scaling for the nodepool and set the minimum and maximum node counts. Since we want to scale up from, and down to, zero, the minimum node count will be 0. The maximum is the largest number of nodes that you want to run at any time; for this exercise I set it to 3.
This is the gcloud command I used to create the nodepool. You can always use the GCP console if you prefer.
gcloud beta container --project "sas-fsi" node-pools create "batchcompute" \
  --cluster <<YOUR_CLUSTER>> --region "us-east1" \
  --node-labels=workload.sas.com/wlm=batch,workload.sas.com/class=compute,nodepool=batchcompute,launcher.sas.com/prepullImage=sas-programming-environment \
  --node-taints=workload.sas.com/class=compute:NoSchedule \
  --machine-type "n2-standard-16" --disk-type "pd-ssd" --disk-size "200" --ephemeral-storage local-ssd-count=4 \
  --enable-autoscaling --num-nodes "0" --min-nodes "0" --max-nodes "3"
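Once the nodepool has been created, it is worth confirming that the autoscaling settings were picked up. A hedged check, using the same cluster and region placeholders as above:
# Confirm the autoscaling configuration of the new nodepool
gcloud container node-pools describe batchcompute --cluster <<YOUR_CLUSTER>> --region "us-east1" --format="value(autoscaling)"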
Enabling a ClusterRole and a ClusterRoleBinding
The SAS Workload Orchestrator daemons require information about the nodes used to run jobs, which includes reading the Kubernetes labels we rely on here. To obtain this information, some elevated privileges are needed that are not present in a standard Viya deployment. To grant them, you must add a ClusterRole and a ClusterRoleBinding for the SAS Workload Orchestrator service account. See here for more information.
Checking your environment
Once you have your deployment up and running it’s best to check that everything is as expected.
Check for role bindings:
kubectl get clusterrolebindings | grep sas-workload
Output expected:
sas-workload-orchestrator-<namespace> ClusterRole/sas-workload-orchestrator
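As an additional sanity check, you can confirm that the service account can actually read node information. This is a sketch that assumes the service account is named sas-workload-orchestrator and lives in your Viya namespace; adjust if your deployment differs.
# Should return "yes" once the ClusterRole and ClusterRoleBinding are in place
kubectl auth can-i list nodes --as=system:serviceaccount:<namespace>:sas-workload-orchestrator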
Configuring WLM using Environment Manager
Log into Environment Manager and navigate to the WLM section. Select the Hosts tab; you should see the standard compute node with a host type of default. At this stage you should not see any other hosts; remember, we are scaling from zero. If we had set the minimum node count to 1 or more, we would see those nodes here too.
Figure 1 Starting from zero
Defining a new host type in WLM
In order to utilize the new nodepool we need to create a new host type in WLM. Select Configuration, then Host Types and click “New host type”.
Give the host type a name and a description. Under Host Properties, add the label we added to the nodepool, in this case workload.sas.com/wlm=batch. Tick Enable Autoscaling. We should also set the maximum number of jobs that a host can run. This can be tuned later, but for now let’s set it to 8; this will let us fill hosts quickly, which is better for demonstration purposes. Save the new host type.
Note that we don’t use the IP or host name options here, as these would change every time the nodepool is scaled down and back up.
Figure 2 New host type configuration
Setting up a queue to use the new host type
Navigate to the Configuration tab. Select Queues and then New queue. Give it a meaningful name (batchqueue) and a description. In the Host types field, enter the host type we defined above (batchhost). Enter a priority, anything for now, and save your new queue.
Figure 3 Batch queue configuration
Notice the new auto-scaling properties. These allow you to tune how quickly the system scales up. We will leave them at their defaults for now. For more information on WLM configuration, see the documentation here.
Submitting jobs
To submit a job in batch mode I use the sas-viya CLI. If you haven’t already done so, you will need to install it; see here for instructions. Make sure to add the batch plugin. If you already have the CLI, but it’s been a while since you installed it, this would be a good time to update it and any plugins; see here for instructions. Ensure you have created a profile and have successfully authenticated.
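For reference, the plugin install and authentication steps look roughly like the commands below. This is a sketch; the exact sub-commands can vary between CLI versions, so check the linked instructions for your release.
# Add the batch plugin from the SAS repository
~/sas-viya plugins install --repo SAS batch
# Create a profile (prompts for the Viya endpoint and other options), then authenticate
~/sas-viya profile init
~/sas-viya auth login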
To simplify submitting a number of batch jobs I use the following script.
#!/bin/bash
#
# Submit a number of batch jobs to a WLM queue via the sas-viya CLI
#
timestamp() {
  date +"%T" # current time
}
#
# SAS program to submit (any reasonably long-running program will do)
sas_prog=/r/sanyo.unx.sas.com/vol/vol420/u42/sukeob/GridTests/lr2.sas
#
# Expect three arguments: number of jobs, queue name, compute context
if [ "$#" -ne 3 ]; then
  echo "Illegal number of parameters"
  echo "Usage: submit_batch <number_of_jobs> <queue_name> <context_name>"
  exit 2
fi
#
echo ">>> Submitting $1 batch jobs >>>"
for ((job=1; job<=$1; job++))
do
  echo -n "Starting batch job# $job of $1 at "
  timestamp
  ~/sas-viya batch jobs submit-pgm --pgm-path $sas_prog --queue-name $2 --context $3 --restart-job
done
In this example I use a SAS program called lr2.sas, but any valid SAS program will do. Just make sure it executes for long enough to fill the hosts with jobs so that additional hosts are required. For demonstration purposes I submit 40 jobs. This should fill the 3 hosts; remember, we set the maximum to 8 jobs per host and the maximum number of nodes to 3, so a maximum of 24 jobs can run concurrently. (There are a few spare jobs in case some jobs finish before the 3rd node is provisioned.)
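While the batch is running, you can keep an eye on the submitted jobs from the command line as well as from Environment Manager. A hedged example using the batch plugin (the exact sub-command may vary between CLI versions):
# List the submitted batch jobs and their current state
~/sas-viya batch jobs list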
Scaling from zero
Before we submit the jobs, let’s check to ensure that we are starting from zero. Using SAS Environment Manager we can see only the default compute host; no jobs are running and none are queued.
Figure 4 Hosts at the start of the batch
Figure 5 Queue state while waiting on 1st host
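The same starting point can be confirmed with kubectl; with the nodepool scaled to zero, a query for the batch nodes should come back empty. The nodepool label below is the one set in the gcloud command earlier.
# Should report that no nodes were found while the nodepool is at zero
kubectl get nodes -l nodepool=batchcompute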
First Host
Using the script above we submit the 40 jobs (submit_batch.sh 40 batchqueue default).
Once the jobs are submitted to the batchqueue queue, WLM identifies that this queue requires a host of type batchhost. However, no hosts of this type are available, so WLM makes a request to Kubernetes for one of these nodes. It takes a few minutes for the node to be ready to accept jobs; once the node is available, WLM dispatches some of the queued jobs onto it.
Figure 6 1st Host available and full
Figure 7 Queue state while waiting on 2nd Host
We can see that 8 jobs are now running and the new host is full. WLM will continue to queue the remaining 32 jobs and request another host.
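If you want to follow the scale-up outside of Environment Manager, you can watch the batch nodes appear as the Cluster Autoscaler provisions them:
# Watch batchcompute nodes being added as the cluster scales up
kubectl get nodes -l nodepool=batchcompute --watch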
Second Host
When the next host is available, WLM will deploy some more of the queued jobs onto it.
Figure 8 2nd Host available and full
Figure 9 Queue state while waiting on 3rd Host
We can now see that 16 jobs are running and the 2 hosts are full. WLM will continue to queue the remaining 24 jobs and request yet another host.
Third Host
Finally, we get all 3 hosts running and WLM deploys more jobs, so we now have 24 jobs running. We are at the maximum number of hosts, so WLM will not request any more. Instead it will queue the remaining jobs until slots become free on one of the three nodes. (Some jobs have already completed and we have 8 jobs left in the queue.)
Figure 10 All 3 hosts available and full
Figure 11 Remaining jobs waiting for a free slot
Scaling back down
When all the jobs have completed, we can see the 3 batch hosts are still available, but are not running any jobs.
Figure 12 Hosts waiting for more jobs
If the hosts remain idle, Kubernetes will eventually scale them down.
Figure 13 All 3 hosts have been scaled down.
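The idle period before a node is removed depends on the Cluster Autoscaler settings; by default it is of the order of ten minutes. Once the scale-down has happened, the same node query used earlier confirms we are back to zero:
# No batchcompute nodes remain after the scale-down
kubectl get nodes -l nodepool=batchcompute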
So that’s it, we are back to where we started, with no hosts. With WLM we are able to provision the compute resources we required, but only for as long as we needed them. This is very powerful and can dramatically reduce the total cost of ownership for SAS WLM customers. It can also help reduce batch windows, as customers will be more willing to assign additional resources to get jobs processed more quickly if they can scale down when the batch is finished.
There is lots more to consider when deploying WLM but that’s all from me for now.
Eoin