The SAS Viya platform includes SAS/CONNECT, with all its capabilities, by default (and for free!). Previous articles, including "Moving to the cloud with SAS/CONNECT", have shown how SAS/CONNECT has evolved to adapt to cloud environments and support our customers in their journey to modern architectures. Innovations keep coming! Starting with SAS Viya version 2023.10, SASCMD sign-ons (a.k.a. MP Connect sign-ons) start new SAS sessions in new pods by default.
Let's start by stepping back a moment to see what MP Connect is, and why this is a welcome change.
Before cloud computing, before any kind of multi-machine distributed computing, SAS customers had no choice but to run their code on a single machine (yes, you can tell I’ve been at SAS for quite some time).
SAS capabilities have always been disruptive, even in those limited environments. Multi-process CONNECT (or MP Connect) was conceived to divide time-consuming tasks into multiple units of work and to execute these units of work in parallel. It did that – and still does – by providing a framework that lets you start and coordinate multiple child SAS processes from a controlling parent SAS session. This way you can parallelize code that otherwise runs sequentially, for the purpose of reducing the total elapsed time necessary to execute a particular application.
An example of a process split into multiple parallel subprocesses.
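The parent/child pattern described above can be sketched in a few statements. This is a minimal illustration, not code from a real application: the session names `task1` and `task2` and the two procedures are placeholders, and `sascmd="!sascmd"` assumes the child sessions can be started with the same command that started the parent.

```sas
/* Start two child SAS sessions from the parent session.            */
/* "!sascmd" asks SAS/CONNECT to reuse the parent's start command.  */
signon task1 sascmd="!sascmd";
signon task2 sascmd="!sascmd";

/* Submit one unit of work to each child. WAIT=NO returns control   */
/* to the parent immediately, so both units run in parallel.        */
rsubmit task1 wait=no;
   proc means data=sashelp.cars;
      var msrp;
   run;
endrsubmit;

rsubmit task2 wait=no;
   proc freq data=sashelp.cars;
      tables origin;
   run;
endrsubmit;

/* Block the parent until both children have finished. */
waitfor _all_ task1 task2;

/* Pull the spooled log and output of each child into the parent. */
rget task1;
rget task2;

/* End all child sessions. */
signoff _all_;
```

The total elapsed time is then driven by the slowest unit of work rather than by the sum of all units.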
MP Connect capabilities are automatically available on the SAS Viya platform, because they are provided by the SAS/CONNECT product, which is included by default in every SAS Viya environment. What happens when you move your existing MP Connect code to SAS Viya? On the surface, it's business as usual: child processes are spawned from the parent session to execute part of your code in parallel. But, under the hood, the execution environment is completely different. What used to run on a physical machine now runs in Kubernetes, inside a pod.
MP Connect spawning child sessions in a compute pod.
Maybe your original machine had 8 cores and 64GB of RAM (not too much, in modern terms) and your code is written to take full advantage of that computing power by spawning 7 additional sessions that, together with the parent, use almost 100% of all CPUs. Kubernetes is built to control the execution environment and prevent a single pod from exhausting the resources of the node where it's running. As soon as your code starts requesting all that CPU power, Kubernetes will throttle your pod: in a default configuration, down to a maximum of 2 CPUs. As a result, work that used to be spread over 8 cores is squeezed into 2, and your existing code can run roughly 4 times slower! Clearly, this setup does not scale as expected. You could argue that the issue can be easily fixed by configuring the Kubernetes cluster to give more resources to SAS compute and connect pods. Yes, this solution could work, but it suffers from two problems:
What could be a better solution? Obviously, embracing a cloud-native design and scaling out to multiple pods!
If you've been following so far, you can now understand why the new default behavior is welcome in cloud environments. Starting with SAS Viya 2023.10, every new MP Connect sign-on starts a child SAS process in its own dedicated pod by default, embracing cloud-native scalability and elasticity. Kubernetes can spread the pods across multiple nodes and, if the cluster is configured for auto-scaling, the sky is the limit... or rather, your budget!
MP Connect launching child sessions in dedicated connect pods.
Every time a new default is introduced in existing environments, it's important to give SAS Administrators the option to embrace the new capability or to reset the SAS Viya platform to behave just like before. In this case, you can use a new environment variable, SAS_LOCAL_MPCONNECT. When set to true, it re-enables local MP Connect sign-ons, i.e. the original functionality of spawning a child session in the same pod where the parent process is running.
A SAS Administrator can use SAS Environment Manager to set the SAS_LOCAL_MPCONNECT environment variable to true in sas.compute.server: startup_commands and sas.connect.server: startup_commands configuration instances. In this case, the setting is configured for every SAS compute and connect server running in the environment.
Setting the SAS_LOCAL_MPCONNECT option for all compute server sessions.
A more limited scope can be achieved by setting the option case by case, as needed. As an end user, you can set the variable in your own code just before submitting the SIGNON statement.
In this case, only the current execution reverts to the previous functionality.
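A minimal sketch of such a per-session override follows. It assumes the variable is set through the `SET=` system option of the OPTIONS statement. The variable name and the value `true` come from this article, but the exact statement form, the session name `task1`, and the `sascmd="!sascmd"` value are assumptions.

```sas
/* Assumption: define the environment variable for this session only, */
/* using the SET= system option before any sign-on takes place.       */
options set=SAS_LOCAL_MPCONNECT="true";

/* Sign-ons submitted after this point are expected to start the      */
/* child session locally, in the same pod as the parent.              */
signon task1 sascmd="!sascmd";
```

Because the setting lives in the submitted code, other users and other sessions keep the new default.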
This new capability seems the obvious choice when you are writing code that uses MP Connect in SAS Viya. So why would you revert to the previous functionality? The most obvious answer is when you are migrating existing code that could be broken by the new behavior. If the existing code uses local resources to share data between MP Connect sessions, it cannot work as-is when these sessions are launched in different pods. Here are a couple of examples:
Those issues can be solved by re-architecting your application, for example by sharing data through Kubernetes volumes mounted in every session's pod instead of using local directories. Yet, reverting to the previous behavior can be an interim step to keep the code running during your migration.
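As a rough sketch of the shared-volume approach, parent and child can exchange data through a library that points to the same mounted path. The path `/mnt/shared/mpconnect`, the libref `shared`, and the session name `task1` are hypothetical. The sketch assumes an administrator has mounted the same volume at that path in every compute and connect pod.

```sas
/* Hypothetical mount point, assumed to be available at the same   */
/* path in the parent pod and in every child pod.                  */
%let shared_dir = /mnt/shared/mpconnect;
libname shared "&shared_dir";

signon task1 sascmd="!sascmd";

/* Copy the macro variable to the child session, because code      */
/* inside RSUBMIT is resolved in the child, not in the parent.     */
%syslput shared_dir=&shared_dir / remote=task1;

rsubmit task1 wait=no;
   /* The child assigns its own libref to the shared location. */
   libname shared "&shared_dir";
   data shared.part1;
      set sashelp.class;
      where sex = 'F';
   run;
endrsubmit;

waitfor _all_ task1;
signoff task1;

/* The parent reads the table that the child wrote. */
proc print data=shared.part1;
run;
```

Nothing here relies on the child's local file system, so the same code should behave the same way whether the child runs in the parent's pod or in a dedicated one.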
I am always excited when I see how SAS Viya keeps evolving by embracing existing functionality and integrating it into modern cloud-native architectures. We have seen in this post how traditional MP Connect code can be used without sacrificing Kubernetes capabilities to manage resource utilization, provide scalability when required, and ensure user-level process isolation.