Determine how many SAS Viya analytics pods can run on a Kubernetes node – part 2

Since August 2023, the SAS Viya platform has included the SAS Workload Management offering and activates it by default at initial deployment. If you're familiar with grid computing from SAS 9, then you're well aware of the powerful capabilities that this brings to SAS Viya. The concepts of jobs, queues, priority, and preemption can now be offered at the user level for SAS Viya and are no longer exclusively the domain of the Kubernetes administrator.

This post is the second in a three-part series where we investigate how many SAS Compute Server pods (as a representative example of the SAS Programming Runtime Environment) can run in the Kubernetes cluster. Continuing from where we left off, we'll now look at the SAS Workload Orchestrator and how it manages the SAS Compute workload.

In this post, we'll often look specifically at the SAS Compute Server pod. The items discussed here apply equivalently to the other related SAS Programming Runtime Environment servers, like SAS Batch and SAS/CONNECT.

[ Part 1 | Part 3 ]

Schedulers, briefly

Kubernetes includes a control-plane process known as kube-scheduler. It's a Kubernetes Controller responsible for assigning pods to nodes. It evaluates numerous attributes of the pods and nodes, ranks the best fit, and then binds the pod to the target node.

Effectively, there are two axes of considerations for scheduling pod placement:

Affinities: this covers everything that determines how pods relate to nodes and each other, including labels (and node selectors), taints (and tolerations), pod affinities (and anti-affinities), and so on.
Quality of Service: Kubernetes defines three QoS classes (Guaranteed, Burstable, and BestEffort), but I'm also including under this umbrella other items that inform QoS, such as pod requests and limits for CPU and RAM, as well as the available host resources that Kubernetes monitors (like storage space, CPU, RAM, etc.).
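To make the QoS classes concrete, here is a minimal Python sketch of how Kubernetes derives a pod's class from its containers' requests and limits. The function name and the dictionary layout are my own for illustration, not a Kubernetes or SAS API. It also compares quantity strings literally, so it assumes equal values are written the same way.

```python
def qos_class(containers):
    """Classify a pod as Guaranteed, Burstable, or BestEffort.

    Each container is a dict with optional "requests" and "limits" dicts,
    for example {"requests": {"cpu": "50m"}, "limits": {"cpu": "2"}}.
    Simplification: quantities are compared as strings, so "1" and "1000m"
    would be treated as different values here.
    """
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit when a limit is set.
            request = requests.get(resource, limit)
            if request is not None or limit is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


# A container that requests less than its limit makes the pod Burstable.
example = [{"requests": {"cpu": "50m", "memory": "300M"},
            "limits": {"cpu": "2", "memory": "2Gi"}}]
print(qos_class(example))
```

A pod is Guaranteed only when every container sets CPU and memory limits with matching requests, BestEffort when nothing is set at all, and Burstable in every other case.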

With SAS Workload Management, we have additional scheduling capabilities to apply to Viya's dynamically-launched processing pods, like SAS Compute Server.

This provides a third axis:

SAS Workload Management: introduces the concepts of jobs, queues, priority, and pre-emption to the Kubernetes realm as applied to SAS Compute Servers (and other SAS Programming Runtime Environment servers like SAS Batch and SAS/CONNECT). It also provides a way to specify and track resource utilization and establish thresholds and other factors that impact how jobs are scheduled to various hosts.

One nifty aspect of Kubernetes is its extensibility, which includes support for custom pod schedulers. SAS Workload Management provides the SAS Workload Orchestrator to act as the scheduler for dynamically-launched processing pods, like SAS Compute Server. In a Kubernetes cluster where SAS Workload Management is active, SAS Workload Orchestrator is responsible for scheduling all of the SAS Compute Server pods independently of the kube-scheduler.

SAS Workload Orchestrator is constrained in two fundamental ways:

It only manages Viya's dynamically-launched processing pods of the SAS Programming Runtime Environment as instantiated by SAS Compute, SAS Batch, and SAS/CONNECT Servers as well as other pods running Python or R as started by the Batch service. That is, it doesn't schedule any other Viya pod workloads (as of stable-2023.11, subject to change).
It can only schedule pods to run on nodes labeled as "workload.sas.com/workload=compute"
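As a small illustration of the second constraint, the sketch below filters a list of nodes down to those carrying the compute label. The node records are hypothetical plain dictionaries, not objects returned by a Kubernetes client, and the node names are made up.

```python
# Label key and value taken from the constraint described above.
COMPUTE_LABEL_KEY = "workload.sas.com/workload"
COMPUTE_LABEL_VALUE = "compute"


def compute_nodes(nodes):
    """Return the names of nodes labeled for SAS compute workloads."""
    return [
        node["name"]
        for node in nodes
        if node.get("labels", {}).get(COMPUTE_LABEL_KEY) == COMPUTE_LABEL_VALUE
    ]


# Hypothetical cluster inventory for illustration only.
cluster = [
    {"name": "node-a", "labels": {"workload.sas.com/workload": "compute"}},
    {"name": "node-b", "labels": {"workload.sas.com/workload": "stateless"}},
    {"name": "node-c", "labels": {}},
]
print(compute_nodes(cluster))
```

Only nodes that pass this kind of filter are candidates for SAS Workload Orchestrator's own queue and priority logic.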

For these reasons, SAS Workload Orchestrator implements its own logic as defined by jobs, queues, priority, etc. when scheduling SAS Compute Server pods to the "compute" nodes of the Kubernetes cluster. Unless otherwise directed, it effectively ignores the typical Kubernetes customization attributes such as node labels and taints, pod affinities (and anti-affinities), and so on.

It must, however, handle the pod and node attributes that correspond to physical resources. This means that SAS Workload Orchestrator will evaluate the total composite resource requests a dynamically-launched processing pod defines for CPU and RAM. It is also subject to infrastructure constraints such as the maximum pods allowed per node (configured for the kubelet as driven by the cloud-provider's managed Kubernetes service - in AWS, for example, this ranges from 4 to 737 pods depending on instance type).
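The composite request can be sketched as follows. Kubernetes computes a pod's effective request for each resource as the larger of (a) the biggest single init container request and (b) the sum of the regular container requests, because init containers run one at a time before the regular containers start. Whether SAS Workload Orchestrator applies exactly this formula is an assumption on my part, and every number below is hypothetical.

```python
def effective_requests(init_containers, containers):
    """Pod-level requests: max(largest init container, sum of regular containers)."""
    totals = {}
    for resource in ("cpu_m", "mem_mib"):
        largest_init = max((c.get(resource, 0) for c in init_containers), default=0)
        regular_sum = sum(c.get(resource, 0) for c in containers)
        totals[resource] = max(largest_init, regular_sum)
    return totals


def fits_on_node(pod_request, node_free, pods_running, max_pods):
    """Check the pod against free resources and the per-node pod ceiling."""
    if pods_running >= max_pods:
        return False
    return all(pod_request[r] <= node_free.get(r, 0) for r in pod_request)


# Hypothetical requests in CPU millicores and MiB of RAM.
init = [{"cpu_m": 500, "mem_mib": 500}, {"cpu_m": 100, "mem_mib": 100}]
regular = [{"cpu_m": 50, "mem_mib": 300}, {"cpu_m": 50, "mem_mib": 50}]

pod = effective_requests(init, regular)
print(pod)
print(fits_on_node(pod, {"cpu_m": 4000, "mem_mib": 8000}, pods_running=10, max_pods=110))
```

In this made-up case the init containers, not the main container, set the pod's effective request, so small requests on the main container alone do not tell the whole story.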

Illustrating the SAS Compute Server pod

Like other pods, the sas-compute-server pod includes both init containers and regular containers to perform its function.

Init containers in sas-compute-server pod:

sas-certframe: sets up encrypted communications for this pod with the rest of SAS Viya
sas-config-init: prepares the environment for the main container

Regular containers in the sas-compute-server pod:

sas-programming-environment: this is the main container and why this pod exists - it provides the runtime engine for executing SAS program code
sas-process-exporter: this is a sidecar container that supports pod operations. It exists only when SAS Workload Management is active: SAS Workload Orchestrator injects it into the pod at creation to track key metrics and other information.

Notice that each container shown here is defined with requests and limits for both CPU and RAM. For pods in general, that's not required (some pods might have no requests or limits defined at all), but is helpful to ensure that the sas-compute-server pod can be managed intelligently with the available environmental resources. The values shown above are the defaults, but can be modified to suit the needs of your environment.

Adjusting the SAS Compute Server's requests for CPU and RAM

SAS Environment Manager provides a path to edit the requests and limits for CPU and RAM for SAS Compute (and other SAS Programming Runtime Environment servers like SAS Batch and SAS/CONNECT) through the SAS Launcher configuration.

Open SAS Environment Manager > Configuration > View: All Services > select Launcher service > Edit sas.launcher.default Configuration:

Here you can see the defaults that will be used for requesting (and limiting) CPU and RAM usage by the sas-programming-environment container in the associated pod for SAS Compute, SAS Batch, or SAS/CONNECT. In this example, the container will request 50 millicores of CPU initially with a limit of 2 CPU cores and it will request 300 Megabytes of RAM initially with a limit set at 2 Gibibytes. Alternatively, advanced configuration of these values (for launcher contexts and associated compute contexts) can be made to the podTemplate.
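Because these values mix units (millicores, decimal megabytes, and binary gibibytes), it can help to convert them to base units before comparing them. Below is a minimal Python sketch of a parser for the common Kubernetes quantity suffixes. The strings "50m", "300M", and "2Gi" are my assumption about how the values described above would be written, and the parser ignores less common forms such as exponent notation.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal (power-of-ten) suffixes.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(quantity):
    """Convert a quantity string to base units (CPU cores or bytes)."""
    if quantity.endswith("m"):
        # "m" means milli, as in 50m = 0.05 CPU cores.
        return Decimal(quantity[:-1]) / 1000
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    return Decimal(quantity)


for text in ("50m", "2", "300M", "2Gi"):
    print(text, "=", parse_quantity(text))
```

Note the difference between the decimal suffix M (1,000,000 bytes) and the binary suffix Mi (1,048,576 bytes), which is easy to overlook when requests and limits are written in different styles.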

SAS Compute's variable workload

The kind of work that dynamically-launched processing pods, like the SAS Compute process, can perform varies widely. Depending on the task it's assigned, a pod might do anything from very small to very large amounts of work. It might simply sit idle waiting for a new task, or it might crunch through large data sets applying specialized analytics. Sometimes it might run auto-generated code as directed by a SAS Viya client, like SAS Model Studio, or it could run user-written ad-hoc code submitted from SAS Studio or scheduled as part of a batch process. With MP CONNECT technology, one SAS Compute process could programmatically spawn a number of additional SAS Compute processes as children that run serially, in parallel, or both.

This means that the initial requests (and optionally, limits) for the SAS Compute pod can be challenging to tune perfectly. Unless you're dealing with a well-understood, repeatable procedure, it's generally recommended to embrace this variability in a few ways:

Set requests that start small, but allow for growth as needed (with high limits or none at all). Setting requests for containers in the SAS Compute pod to achieve the Kubernetes Burstable quality of service is more flexible and more effectively ensures sufficient resource utilization.

Define compute contexts and partner with users on their appropriate application such that SAS Compute pod definitions are better tuned for the expected workload.

Employ SAS Workload Orchestrator's job, queue, and host attributes to schedule jobs to hosts best suited to the task(s) at hand.

One more thing when it comes to the sas-programming-environment container. As the SAS runtime, it is internally configured by a number of SAS system options, and one is of particular interest here: MEMSIZE. MEMSIZE can be set in SAS Environment Manager as an invocation option that specifies the limit on the total amount of memory that the SAS session will try to use. The default value is 2 GB (hence the matching limit of 2 GB for the sas-programming-environment container above). When you change Kubernetes resource requests and limits, it's recommended to keep these settings consistent with each other at every level: in the SAS runtime, in the SAS pod spec, in Kubernetes, and in the infrastructure that hosts it.

Coming up

In the last post of this series, we'll figure out how many SAS Compute Servers can run on a given node and what the constraining factors are.

H/T

I'd like to tip my hat in gratitude with warmest regards to several people who helped explore this topic with me: David Stern, Scott McCauley, Edoardo Riva, Raphaël Poumarede, Joe Hatcher, Craig Rubendall, Seth Heno, and Doug Haigh. Any mistakes are my own, not theirs. 😉

Find more articles from SAS Global Enablement and Learning here.