10-14-2024
MichaelGoddard
SAS Employee
Member since
05-07-2017
- 24 Posts
- 1 Likes Given
- 0 Solutions
- 0 Likes Received
About
Michael is a Principal Technical Architect in the Global Enablement and Learning Team, SAS R&D. His primary focus is on architecture and deployment of the SAS platform.
Latest posts by MichaelGoddard
- 1413 views, posted 10-13-2024 09:27 PM
- 1534 views, posted 06-09-2024 07:23 PM
- 1835 views, posted 05-13-2024 04:53 PM
- 1381 views, posted 04-15-2024 12:40 AM
- 1511 views, posted 02-28-2024 05:48 PM
- 1218 views, posted 12-18-2022 06:00 PM
- 1550 views, posted 12-18-2022 05:12 PM
- 4403 views, posted 10-20-2022 06:53 PM
- 1801 views, posted 10-19-2022 06:38 PM
- 1775 views, posted 08-04-2022 06:13 PM
Activity Feed for MichaelGoddard
- Tagged Running SingleStore Studio within the SAS Viya namespace on SAS Communities Library. 10-13-2024 09:29 PM
- Posted Running SingleStore Studio within the SAS Viya namespace on SAS Communities Library. 10-13-2024 09:27 PM
- Tagged Running SAS Viya on a shared Kubernetes cluster – Part 2 on SAS Communities Library. 06-09-2024 07:25 PM
- Posted Running SAS Viya on a shared Kubernetes cluster – Part 2 on SAS Communities Library. 06-09-2024 07:23 PM
- Posted Re: Creating custom SAS Viya topologies – realizing the workload placement plan on SAS Communities Library. 05-13-2024 04:53 PM
- Tagged Running SAS Viya on a shared Kubernetes cluster – Part 1 on SAS Communities Library. 04-15-2024 12:41 AM
- Posted Running SAS Viya on a shared Kubernetes cluster – Part 1 on SAS Communities Library. 04-15-2024 12:40 AM
- Liked Using NFS Premium shares in Azure Files for SAS Viya on Kubernetes for AbhilashPA. 03-03-2024 03:12 PM
- Tagged SAS Viya topologies: sharing a node pool for Compute and CAS on SAS Communities Library. 02-28-2024 05:50 PM
- Tagged SAS Viya topologies: sharing a node pool for Compute and CAS on SAS Communities Library. 02-28-2024 05:49 PM
- Posted SAS Viya topologies: sharing a node pool for Compute and CAS on SAS Communities Library. 02-28-2024 05:48 PM
- Tagged Deploying SAS Container Runtime models on Azure Container Instances on SAS Communities Library. 12-18-2022 06:01 PM
- Posted Deploying SAS Container Runtime models on Azure Container Instances on SAS Communities Library. 12-18-2022 06:00 PM
- Posted Exploring the configuration: using Python with SAS Analytics Pro on SAS Communities Library. 12-18-2022 05:12 PM
- Tagged Exploring the configuration: using Python with SAS Analytics Pro on SAS Communities Library. 12-18-2022 05:12 PM
- Posted Creating model publishing destinations using the SAS Viya CLI on SAS Communities Library. 10-20-2022 06:53 PM
My Library Contributions
10-13-2024
09:27 PM
In this post we will look at running SingleStore Studio in the Kubernetes cluster, within the SAS Viya namespace. As some context, I’m talking about SAS with SingleStore deployments (orders). The SingleStore tools and SingleStore Studio are not shipped as part of the SAS order, and are usually installed on a machine external to the Kubernetes cluster.
There are many benefits to running SingleStore Studio (S2 Studio) within the SAS Viya namespace, but there are also some challenges; namely, SingleStore do not provide a standalone container image for deploying SingleStore Studio. Note, the SingleStore documentation also uses the term SingleStoreDB Studio.
Here we will look at creating a container image to run the SingleStore Client (command-line) and S2 Studio, and deploying it to the SAS Viya namespace.
I would like to start by saying that SingleStore do provide an image containing S2 Studio: the ‘singlestore/cluster-in-a-box’ image. As the name suggests, this image contains a complete environment and is targeted at developers.
SingleStore have several images on Docker Hub, see: https://hub.docker.com/u/singlestore. But they do not provide an image for just running S2 Studio, nor do SAS include this as part of the SAS Viya with SingleStore order.
As some background, with a SAS with SingleStore order all the SingleStore components (the memsql cluster) run within the Viya namespace.
Why run SingleStore Studio within the Viya namespace?
Let’s start by discussing the benefits of running S2 Studio on Kubernetes, within the Viya namespace.
The key benefits of running S2 Studio in the Kubernetes cluster are simplified networking and security, as the S2 Studio server application is connecting directly to the SingleStore services running within the SAS Viya namespace.
However, for a secure connection to the SingleStore cluster a WebSocket Proxy implementation is used. This means that a direct connection from the user’s browser to the backend is required. I will talk more about this in a follow-up post on enabling TLS security for the S2 Studio application.
The SingleStore documentation states the following:
“For situations where REQUIRE SSL is not mandatory, and if the additional configuration required to use a direct WebSocket connection becomes a bottleneck, it may be simpler to use the existing Studio architecture, where Studio is served over HTTPS and the singlestoredb-studio server is co-located with the Master Aggregator.”
The REQUIRE SSL attribute is a memsql user setting.
Therefore, running the singlestoredb-studio server within the Viya namespace effectively collocates it with the memsql cluster (the Master Aggregator). The communication over port 3306 (which is unencrypted) is contained within the Kubernetes cluster and is not exposed to the outside world.
The SingleStoreDB Studio Architecture page also states that multiple S2 Studio instances can communicate with an individual cluster, so you can easily scale out S2 Studio by creating new instances to manage user load. Running S2 Studio as a Kubernetes deployment is therefore another advantage of running it on Kubernetes, rather than installing it on a host machine outside of the K8s cluster.
Building the container image
To run S2 Studio on Kubernetes you first need to build a container image. For this you need to select a base image that contains the packages S2 Studio needs to run. This became a process of research (looking at what SingleStore were using for their images) and trial and error. The CentOS image works well and contains utilities like systemctl, but the resulting image is very large at over 600MB.
In the end I settled on almalinux/8-init as my base. The nice thing about this and the CentOS image is that they allow the standard install process for the SingleStore Client (CLI) and Studio to be used to build the container image.
Remember, when selecting an OS image for the container build it is important to do due diligence on the security of that image: can it be trusted?
You must create your own Docker build file (Dockerfile); the following image shows my build file. As mentioned above, I decided to build an image that contained the SingleStore CLI and Studio.
In the image, lines 3 to 6 install and update the required packages. Once that is in place the SingleStore Studio and CLI are installed (lines 8 to 13). Line 10 sets the permissions on the ‘singlestoredb-studio.hcl’ configuration file. This is required as the install runs under root, while the container will run as the memsql user (this is set on line 16).
In lines 18 – 20 I added several labels for the image. Lines 22 and 23 show the ports that are exposed. Note, I could have also used ports 80 and 443.
Finally, line 25 specifies the command to run within the container, which starts the S2 Studio server (application).
At this point I would like to acknowledge the assistance from Marc Price (Senior Principal Technical Support Engineer) in getting the Docker buildfile configuration finalised.
The next step is to build the image from the Dockerfile. The following is an example build command:
docker build --tag singlestore-tools --file singlestore-tools .
Note, it is important to include the dot at the end of the command.
This produced an image that was 479MB in size.
Once the image has been built you can use the ‘docker history’ command to review the image layers.
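For example, against the image tag used in the build command above (the layers and sizes you see will depend on your base image and package versions):
docker history singlestore-tools:latest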
Now that I had an image, I tested it by running it on the Docker server. For example:
docker run -d -p 8080:8080 --name singlestore-tools singlestore-tools:latest
Here you can see SingleStore Studio running on my Docker server.
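If you want a quick command-line check that the Studio server is listening, rather than using a browser, something like the following works (assuming the port mapping shown above):
curl -sI http://localhost:8080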
Once I was happy with the image, I tagged it and pushed it to my container registry.
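The tag and push steps would look something like the following; the registry name here is just the placeholder used later in the deployment manifest:
docker tag singlestore-tools:latest myregistry.azurecr.io/singlestore-tools:latest
docker push myregistry.azurecr.io/singlestore-tools:latest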
Deploying SingleStore Studio to the K8s cluster
Now that you have an image, the next step is to create the deployment manifests. You need to create the configuration for deploying the S2 Studio application, along with a service and ingress definition. To pre-configure the ‘studio.hcl’ file a Kubernetes ConfigMap is also required.
The S2 Studio application can be deployed as a single pod, or as a Kubernetes deployment so that it can be scaled. In this example I will show how to use a K8s deployment for S2 Studio. An overview of the configuration is shown in the diagram below.
A key decision is where should the S2 Studio application run?
In this example, it is configured to run on the Stateful nodes, using nodeAffinity for the Stateful nodes. But I could have also configured it to run in the singlestore node pool, as this is where the SingleStore Master Aggregator is running.
With that decided, the next decision is how many replicas to run; here I specified 2 replicas. I was testing in Microsoft Azure using an Azure Container Registry.
---
# singleStore-tools deployment YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: singlestore-tools
    workload.sas.com/class: singlestore
  name: singlestore-tools
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: singlestore-tools
  template:
    metadata:
      labels:
        app: singlestore-tools
        app.kubernetes.io/name: singlestore-tools
        workload.sas.com/class: singlestore
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.azure.com/mode
                operator: NotIn
                values:
                - system
              - key: workload.sas.com/class
                operator: In
                values:
                - stateful
      containers:
      - image: myregistry.azurecr.io/singlestore-tools:latest
        imagePullPolicy: Always # IfNotPresent or Always
        name: s2tools
        resources:
          requests: # Minimum amount of resources requested
            cpu: 1
            memory: 128Mi
          limits: # Maximum amount of resources requested
            cpu: 2
            memory: 256Mi
        ports:
        - containerPort: 8080 # The container exposes this port
          name: http # Name the port "http"
        volumeMounts:
        - name: studio-files-volume
          mountPath: /tmp/s2studio-files
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - '-c'
              - |
                cp /tmp/s2studio-files/studio.hcl /var/lib/singlestoredb-studio/studio.hcl
      tolerations:
      - effect: NoSchedule
        key: workload.sas.com/class
        operator: Equal
        value: stateful
      volumes:
      - name: studio-files-volume
        configMap:
          name: studio-files
A consideration for creating the deployment manifest is that when a ConfigMap is mounted as a volume it becomes read-only. Therefore, you can’t directly mount the studio.hcl file into the target location (as the S2 Studio server requires read-write access to the studio.hcl file).
Above you can see the ‘studio-files’ ConfigMap is mounted as the volume: ‘studio-files-volume’, with a mountPath of ‘/tmp/s2studio-files’.
So, the configMap file(s) are loaded into a temporary location, then copied into the configuration. This is achieved with the following copy command:
cp /tmp/s2studio-files/studio.hcl /var/lib/singlestoredb-studio/studio.hcl
This copies my pre-configured cluster definition (the studio.hcl file) into the Studio server configuration with the required permissions.
Another consideration when deploying multiple replicas is whether to define Pod Affinity / AntiAffinity rules.
For my test environment I defined a single node pool, called services, for the Viya stateful and stateless services. It had the stateful label and taint applied to the nodes. Below you can see that while I hadn’t defined any podAntiAffinity rules, I ended up with the S2 Studio pods (singlestore-tools) running on different nodes.
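To check where the S2 Studio pods have been scheduled, you can list them with their node assignments; the namespace (viya) is an assumption, and the label selector matches the label used in the deployment above:
kubectl -n viya get pods -l app.kubernetes.io/name=singlestore-tools -o wide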
Creating the Service and Ingress definitions
To be able to access the S2 Studio application, a service and an ingress definition are required. We will first look at the service definition.
---
apiVersion: v1
kind: Service
metadata:
  name: s2studio-http-svc
  labels:
    app.kubernetes.io/name: s2studio-http-svc
spec:
  selector:
    app.kubernetes.io/name: singlestore-tools
  ports:
  - name: s2studio-http
    port: 80
    protocol: TCP
    targetPort: 8080
  type: ClusterIP
Here you can see the service definition: the service is called s2studio-http-svc and maps port 80 to port 8080 on the container(s).
To access the S2 Studio application, I also needed a DNS name that would resolve to it; this is the host name used in the ingress definition. In my environment I had a DNS wildcard for:
*.camel-a20280-rg.gelenable.sas.com
Therefore, I used a host name of: s2studio.camel-a20280-rg.gelenable.sas.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: s2studio-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
  labels:
    app.kubernetes.io/name: s2studio-ingress
spec:
  rules:
  - host: s2studio.camel-a20280-rg.gelenable.sas.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: s2studio-http-svc
            port:
              number: 80
Here you can see the ingress is targeting the s2studio-http-svc service.
Create the Studio Server configuration
Given that the S2 Studio application is running in the Kubernetes cluster with SAS Viya it is possible to use the internal service name for the memsql cluster. The key advantage of using the service name is that it keeps the connection from the S2 Studio application to the memsql cluster internal to the K8s cluster.
The service name is also a known value for a SAS Viya with SingleStore deployment, which means it is possible to pre-configure the studio.hcl file with a connection profile for the memsql cluster.
The DDL service name is: svc-sas-singlestore-cluster-ddl
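If you want to confirm the service name in your environment, you can list it directly; the namespace (viya) is an assumption, use your SAS Viya namespace:
kubectl -n viya get svc svc-sas-singlestore-cluster-ddl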
The following was the ‘studio.hcl’ definition that I created.
version = 1

cluster "ViyaS2Profile" {
  name = "SAS Viya DDL Connection"
  description = "Connection using port 3306"
  hostname = "svc-sas-singlestore-cluster-ddl"
  port = 3306
  profile = "DEVELOPMENT"
  websocket = false
  websocketSSL = false
  kerberosAutologin = false
}
Once the file has been created, the following command can be used to create the ConfigMap.
kubectl -n namespace create configmap configmap_name --from-file=file_name
Note, it would have been possible to create an inline definition for the studio.hcl file in the S2 Studio deployment yaml. However, I prefer to keep this separate as it provides more flexibility and makes it easier to load (define) multiple files. We will see this in Part 2 of this post.
In my opinion it also makes it easier to create the files, as you don’t have to worry about yaml indentation. You just create the files as required.
The only consideration for this approach is that the ConfigMap must be in place prior to applying the deployment for the S2 Studio application.
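As a sketch of the overall ordering (the namespace and manifest file names here are illustrative), the steps would look something like:
# 1. Create the ConfigMap holding the pre-configured studio.hcl
kubectl -n viya create configmap studio-files --from-file=studio.hcl
# 2. Apply the deployment, service and ingress manifests
kubectl -n viya apply -f singlestore-tools-deployment.yaml
kubectl -n viya apply -f s2studio-service.yaml
kubectl -n viya apply -f s2studio-ingress.yaml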
The Results…
With the above configuration in place, you are set to start using SingleStore Studio. Below you can see the SingleStore Studio home page with the pre-configured cluster definition.
To review the configuration, the studio.hcl file has a pre-configured profile and the S2 Studio pods connect to the SingleStore Master Aggregator on port 3306 using the DDL service (svc-sas-singlestore-cluster-ddl).
Conclusion
Here we have looked at how to create a container image for the SingleStore Client and Studio. The configuration shown is using HTTP to connect to S2 Studio. In Part 2 I will show how to implement TLS using the SAS Viya secrets.
Finally, it is important to remember that the SingleStore Studio application is not maintained by SAS, and it is not shipped with the SAS Viya with SingleStore order. As such, SAS Technical Support will not provide support for this type of deployment.
Thanks for reading…
Michael Goddard
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- GEL
- singlestore
- studio
06-09-2024
07:23 PM
3 Likes
This is Part 2 of the post on running SAS Viya on a shared Kubernetes cluster. In Part 1 I discussed some of the challenges that can be encountered when deploying SAS Viya in a cluster that has untainted nodes, as well as the main deployment considerations.
In this post I will discuss implementing required scheduling, required nodeAffinity, to force a desired topology when there are untainted nodes in the cluster.
Again, I will share some of the tests that I ran to help illustrate the issues.
To recap, for my testing I created an Azure Kubernetes Service (AKS) cluster with the required SAS Viya node pools, with the labels and taints applied, but they were scaled to zero. In addition to the four node pools that correspond to the standard SAS Viya workload classes (cas, compute, stateful, and stateless) and a ‘system’ node pool for the Kubernetes control plane, I also created an ‘apps’ node pool (without any taints applied).
This was to simulate a scenario where there are other applications running and having several nodes available that didn’t have any taints applied.
Using required scheduling to force a topology
To ensure that the SAS Viya pods run on the desired nodes (when there are untainted nodes), ‘required node affinity’ must be used.
I will start by saying that the easiest path, the simplest configuration option when running with untainted nodes, is to focus on configuring just the CAS and Compute pods. This is easier than reconfiguring all the stateless and stateful services to use required scheduling. There is also a patch transformer for CAS supplied in the overlays: require-cas-label.yaml
See the SAS Viya README: Optional CAS Server Placement Configuration
This means that you only need to focus on the sas-programming-environment components (pods).
I discuss enabling required scheduling in the following post: Creating custom Viya topologies – Part 2 (using custom node pools for the compute pods).
For this test I configured required scheduling for both CAS and Compute, reset the cas and compute node pools to zero nodes, then deployed SAS Viya.
I created the following patch transformers for the sas-programming-environment pods:
sas-compute-job-config-require-compute-label.yaml
sas-batch-pod-template-require-compute-label.yaml
sas-launcher-job-config-require-compute-label.yaml
sas-connect-pod-template-require-compute-label.yaml
To summarize the configuration, the following patch is applied to update the nodeAffinity for the Compute components:
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
While this worked, it did highlight a problem for the first user to log in to SAS Studio, as it is at this point that the first compute node is created. The SAS Studio session timed out waiting for the Compute Server (pod) to download the container images and then start.
I waited for the compute node to fully start then started a new SAS Studio session. This time I got the Compute Server context.
This does beg the question: Should you ever let the compute node pool scale to zero?
Looking at the SAS Viya deployment I still didn’t have any pods running on the stateful or stateless nodes. These node pools were still scaled to zero, all the stateless and stateful pods were running on the ‘apps’ nodes.
But would starting some stateless and stateful nodes prior to the SAS Viya deployment fix this?
At this point I did another test; I manually scaled the stateful node pool to have one node. Before the SAS Viya deployment the AKS cluster had the following nodes.
You can see that I had three nodes available in the SAS Viya node pools and again I had several ‘apps’ nodes (aks-apps-xxxx-vmssnnnn) available to simulate nodes being used for other applications.
I did this to illustrate the Kubernetes pod scheduling behaviour. Kubernetes (the cluster autoscaler) will not scale a node pool until it is needed, regardless of the node affinity for the pod. If an “acceptable” node is available, in this case one without any taints, the pod will be scheduled there as a first choice.
Hence, at the end of this deployment I still only had one stateful node and no stateless nodes.
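A quick way to see which nodes are active and how they are labelled is to list the nodes with the SAS workload class as a column; the agentpool label is an AKS-specific assumption:
# Show nodes with their SAS workload class and AKS node pool
kubectl get nodes -L workload.sas.com/class -L agentpool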
Once SAS Viya had deployed, I used the following command to view the pods running on the stateful node (aks-stateful-12471038-vmss000000) and found there were only 8 pods running on the node.
kubectl -n namespace get pods --field-selector spec.nodeName=node-name
The rest of the Viya stateful and stateless pods were running on the 'apps' nodes.
Using one node pool dedicated to SAS Viya
In this scenario the organisation perhaps wants to dedicate, or reserve, only one node pool to SAS Viya. When sharing a cluster with other applications this is probably the simplest approach to ensure that there are nodes available to meet the requirements for the SAS Viya Compute Server and CAS functions.
I like this configuration as you only need to create one additional node pool in an existing cluster. Then using required scheduling for the CAS and Compute pods it also allows for scaling the node pool to zero.
The problem with scaling the node pools to zero is avoided because starting the CAS server triggers the initial scaling of the shared node pool.
For this test I created a ‘viya’ node pool, with the compute label and taint applied.
To avoid pod drift, the sas-programming-environment pods should be updated to use required scheduling using the standard compute labels and taints, as described above.
Additionally, as the CAS and Compute pods are sharing the same node pool, the CAS server configuration had to be updated to target the shared node pool, the ‘viya’ node pool. For this I used the provided CAS overlay as a template and targeted the workload.sas.com/class=compute label. The tolerations also had to be updated for the compute taint.
The patch transformers for the CAS configuration are shown below. The first patch transformer updates the CASDeployment.
# PatchTransformer to make the compute label required and provide a toleration for the compute taint
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: run-cas-on-compute-nodes
patch: |-
  # Remove existing nodeAffinity
  - op: remove
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
  # Add new nodeAffinity
  - op: add
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
  # Set tolerations
  - op: replace
    path: /spec/controllerTemplate/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1
The second patch transformer updates the 'sas-cas-pod-template'.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-cas-pod-template-tolerations
patch: |-
  - op: replace
    path: /template/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  kind: PodTemplate
  version: v1
  name: sas-cas-pod-template
The final update I made was to adjust the CPU and memory requests and limits for CAS. This was to ensure that there was space available on the shared nodes for the Compute Server pods.
For example, I used nodes with 16vCPU and 128GB memory, then set the CAS limits to 12 vCPU and 96GB memory. I also configured the requests and limits the same to enforce guaranteed QoS. Using guaranteed QoS is important to protect the CAS Server pods when the nodes are busy. It ensures that the CAS pods are among the last to be evicted from a node.
This was my target topology.
To configure the CPU and memory for CAS see the example in sas-bases:
../sas-bases/examples/cas/configure/cas-manage-cpu-and-memory.yaml
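Once deployed, you can confirm that Kubernetes has assigned the Guaranteed QoS class to the CAS pods; the namespace (viya) and the default controller pod name are assumptions, adjust them for your environment:
kubectl -n viya get pod sas-cas-server-default-controller -o jsonpath='{.status.qosClass}'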
When I deployed SAS Viya with an MPP CAS Server, this had the benefit of ensuring that multiple nodes were available for the Compute Server pods. The Compute pods were configured with the resource defaults (CPU and memory requests and limits).
As a side note (running SAS Viya 2024.03), if you inspect the sas-compute-server pod you will see two running containers. The sas-programming-environment container has resource limits of 2 cpu and 2Gi memory, and the sas-process-exporter container has limits of 2 cpu and 4Gi memory.
Once SAS Viya was running, I then started multiple SAS Studio sessions. In the image below you can see that the Compute Server pods are running on multiple ‘viya’ nodes.
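To see this from the command line rather than a screenshot, you can list the compute server pods with their node assignments (the namespace is an assumption):
kubectl -n viya get pods -o wide | grep sas-compute-server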
As an end user life was good, as my SAS Studio sessions started without any failures. 😊
Conclusion
Sharing the cluster with other applications is possible, but this needs to be carefully planned to ensure the best result is achieved for ALL applications.
As always, it’s important to focus on the requirements for the SAS Viya platform. When the cluster has untainted nodes, you should configure required scheduling to ensure you have an operational SAS Viya platform. The simplest approach is probably just to focus on Compute and CAS.
A key question is: Do the untainted nodes meet the system requirements for SAS Viya?
If not, additional node pools WILL be required to run the SAS Viya platform.
Here I have proposed the concept of sharing a node pool for Compute and CAS, and shown how you could reserve some capacity for the two workloads. You should do some capacity planning and sizing to establish suitable node sizes and the appropriate resource reservations.
But keep in mind, it is possible to have a shared node pool for CAS and Compute and still use CAS auto-resources to effectively dedicate some nodes within the node pool to running CAS.
Finally, I have left you with a couple of questions to ponder (from an end-user perspective):
Should you ever let the compute node pool scale to zero?
How many nodes should you have available for the Compute pods to avoid SAS Studio timeouts?
I’m sure this will lead to interesting discussions! Maybe it’s a topic for another day…
- Find more articles tagged with:
- architecture
- deployment
- GEL
- SAS Viya
05-13-2024
04:53 PM
Hi @EyalGonen
The key reasons for possible incompatibilities are around the system requirements, and there could be changes introduced to the cluster-wide resources that would break the older Viya deployments.
Therefore, when sharing a cluster for multiple SAS Viya deployments it is important to ensure that they can coexist. For those reasons I also wouldn't recommend collocating Stable cadence and LTS cadence versions on the same cluster, as that may increase the risk of incompatibilities.
I hope that helps.
04-15-2024
12:40 AM
1 Like
When we think about running SAS Viya in a shared Kubernetes cluster there are two use cases:
Using a dedicated Kubernetes cluster for SAS Viya, running multiple SAS Viya deployments.
Running SAS Viya in a Kubernetes cluster that is shared with other (3rd party) applications.
With the withdrawal of support for application multi-tenancy within SAS Viya, as of Stable 2023.10 and LTS 2023.10, the requirement for running multiple SAS Viya environments in a shared cluster may become more common.
I will start by stating that it is possible to run SAS Viya in a shared cluster in both cases, but there are several considerations. In this post I will focus on running SAS Viya in a Kubernetes cluster shared with other applications.
When using a Kubernetes (K8s) cluster for multiple applications it is important to understand all the application workloads, their individual system requirements and the service levels associated with the applications.
These factors impact the K8s cluster design including the design for the node pools and workload placement. Complications often arise when an organisation doesn’t want to dedicate nodes to applications and does not want to taint nodes.
Applications such as SAS Viya have some specific system requirements, for example:
the need for local storage on the nodes for SASWORK and CAS disk cache
the in-memory processing requirements for the CAS server
a requirement to use GPUs.
This is not common for other business applications.
SAS Viya also has a need for specific node taints and labels. Here I’m particularly thinking about the sas-programming-environment (Compute Server) pods and CAS. Functions like the SAS Workload Orchestration and the sas-programming pre-pull function rely on having nodes labelled with: workload.sas.com/class=compute
Remember, in Kubernetes, node taints are used to repel pods and node labels are used to attract pods. These are used in conjunction with the node affinity and toleration definitions in the application pods. See the Kubernetes documentation: Assigning Pods to Nodes
As the SAS Viya configuration defaults to using preferred scheduling, appropriately labelling and tainting nodes is important to achieving the desired workload placement for a (specific) topology.
The node pools are a way to optimise the deployment for specific application requirements (like GPUs and storage) and workloads.
Another consideration when sharing the cluster with other applications is the level of permissions required to deploy and run SAS Viya. This can be a concern for some organisations. For example, the need for cluster-wide admin permissions.
SAS Viya also has specific system requirements for functions such as the ingress controller, these might be different to, or in conflict with, the application ecosystem that is currently in place.
Hence, the system requirements and admin permissions for SAS Viya can be key drivers for dedicating a cluster to running SAS Viya.
With that said, let’s now look at some deployment scenarios. I felt the easiest way to illustrate some of the deployment concerns was to run some tests.
Running in a cluster with untainted nodes
As the default SAS Viya deployment uses preferred scheduling, one of the key concerns to achieving a target topology is what I call “pod drift”. Pod drift is where the Viya pods end up running on nodes that aren’t designated for the SAS Viya processing. Therefore, you are not achieving the target workload placement for the desired topology for SAS Viya.
For most components this isn’t a problem, but for functions that have specific node requirements this can lead to poor performance and/or resource utilization, or worse, it might even break the Viya deployment in some way!
To illustrate this, I tested the following scenarios when running SAS Viya in a cluster with untainted nodes:
The organisation doesn't taint nodes but has agreed to label and taint some nodes for SAS Viya.
The SAS Viya node pools have been defined to allow them to scale to zero.
Dedicating a single node pool to SAS Viya.
Test 1
In my first test I created an AKS cluster with the required SAS Viya node pools, with the labels and taints applied, but they were scaled to zero. In addition to four node pools that correspond to the standard SAS Viya workload classes (cas, compute, stateful, stateless) and a ‘system’ node pool for the Kubernetes control plane, we also have the ‘apps’ node pool (without any taints applied).
This was to simulate a scenario where there are other applications running and several nodes available that didn’t have any taints applied.
At the start of the SAS Viya deployment the state of the cluster is shown in the image below.
The ‘apps’ node pool had 6 nodes running and the system node pool had one node.
When I deployed SAS Viya the only node pool that was used was the apps node pool. Even with the CAS auto-resources enabled the Kubernetes scheduler selected to start a new node in the apps node pool rather than using the cas node pool. This can be seen in the image below.
In the image you can see that sas-cas-server-default-controller is running on the seventh node (aks-apps-37136449-vmss000006) in the 'apps' node pool.
When I started a SAS Studio session, I couldn’t get a compute server as there were no compute nodes available. This is shown below.
In the image, you can see that I failed to get a SAS Studio compute context, and the SAS Compute Service and the SAS Launcher pods are both running in the apps node pool, but there isn’t any SAS Compute Server pod. There were no nodes available with the compute label:
workload.sas.com/class=compute
At this point I should state that I had the SAS Workload Orchestrator (SWO) enabled with the default configuration. As there weren't any nodes available with the workload.sas.com/class=compute label, the Compute Server for the SAS Studio session was not started.
Obviously, this isn’t a good experience for the SAS users.
It also highlights that it is critical to have node(s) available with the compute label when using the SAS Workload Orchestrator, and what happens when the cluster autoscaler doesn’t trigger the scaling of the compute node pool.
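A quick check for this condition is to list the nodes carrying the compute label; if the command returns nothing, SAS Studio sessions will not get a Compute Server until a node is started:
kubectl get nodes -l workload.sas.com/class=compute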
So, what would happen if there were cas and compute nodes available?
Test 2
In test 2 I manually scaled the cas and compute node pools to have one node each. The image below shows the state of the cluster when I started the test.
The hope here was that I would get the CAS server and Compute Server pods running on the desired nodes.
This deployment was better as the Kubernetes scheduler selected to run pods on the cas and compute nodes. The CAS Server (sas-cas-server-default-controller) was running on the cas node, and I was able to get a compute server when I started SAS Studio. This is shown in the following image.
Here you can see that the sas-compute-server pod is running on the compute node, and the SAS Compute Service and SAS Launcher pods are running in the ‘apps’ node pool.
This still wasn’t perfect as in this test I still didn’t get any pods running on the Stateless or Stateful nodes. I would have had to have nodes available in the stateless and stateful node pools for this to happen.
Another problem with this Kubernetes configuration is that when SWO is disabled, there is no guarantee that all the sas-compute-server pods will be running in the compute node pool. Let's assume that there is a significant programming workload and there isn't sufficient capacity on the existing compute server node(s).
Rather than starting a new compute node, the K8s scheduler could select one of the existing apps nodes to run the new sessions. At this point the users would start to see the error shown above in the first test: the timeout in SAS Studio waiting for the Compute Server pod (the SAS Studio compute context) to start. This is due to the time required to pull the sas-programming-environment image down to a node.
A user could strike it lucky, and the Compute Server could be scheduled onto a node that already has the sas-programming-environment image. But the Compute Server still has dependencies on things like storage for SASWORK and maybe the need for GPUs.
This illustrates the problem of using preferred scheduling in a cluster with untainted nodes.
You can probably live with the stateless and stateful pods running on any available node in the cluster, but how do you ensure that the compute and cas pods are running on the target nodes even when the node pools scale to zero?
To ensure that the Viya pods run on the desired nodes (when there are untainted nodes) ‘required node affinity’ (required scheduling) must be used. I will discuss this in more detail in part 2.
Conclusion
Sharing the cluster with other applications is possible, but this needs to be carefully planned to ensure the best result is achieved for ALL applications.
The SAS Viya system requirements and admin permissions need to be discussed and can be key drivers for dedicating a cluster to running SAS Viya.
As we have seen, there are many factors that affect the Kubernetes scheduling, including node availability, the labels and taints on nodes, as well as the application configuration (in this case SAS Viya), to name a few.
In part 2 we will look at using required scheduling to force a topology (when there are untainted nodes) and dedicating a single node pool to SAS Viya within the shared cluster.
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- deployment
- GEL
- SAS Viya
02-28-2024
05:48 PM
1 Like
This is another post in my SAS Viya Topologies series. This time we will look at using a two-node-pool configuration and how to get the desired topology. We will examine sharing a node pool for Compute and CAS processing. For this I tested using both an SMP CAS Server and an MPP CAS Server.
For this testing I was working in the Microsoft Azure Cloud and using the SAS Infrastructure as Code GitHub project to build the Kubernetes cluster.
Let’s have a look at the details and how to get the desired topology.
Desired topology
As the title suggests, the desired topology was to use two node pools for my SAS Viya deployment. That is, a node pool for the microservices and a single node pool for the Compute and CAS pods. My goal was to use “small” commodity VMs for all services other than the Compute and CAS engines (pods). This is shown in the image below.
I had an objective of simplifying the deployment topology by using a single node type, node pool, for the “compute-tier”.
The rationale for using a single node pool for the Compute and CAS processing is that they have similar node requirements. Nodes with ample CPU and memory with local ephemeral disk. It is also a recognition that you can share the nodes by letting Kubernetes control the pod scheduling.
Deployment Decision
A key deployment, or architectural, decision is how to implement the single node pool for the compute-tier.
To minimise the “custom configuration” I wanted to make use of the standard SAS Viya workload class labels and taints where possible. So, should you configure the Compute pods to run on the CAS nodes, or is it better to configure CAS to run on the compute nodes?
I have previously written about Creating custom Viya topologies – Part 2 (using custom node pools for the compute pods), in that blog I discussed the need to update the following configuration:
sas-compute-job-config
sas-batch-pod-template
sas-launcher-job-config
sas-connect-pod-template
Two additional considerations that I would like to highlight when moving the compute pods are, firstly, the prepull function for the SAS programming environment container image and, secondly, whether SAS Workload Management (WLM) will be enabled. WLM requires at least one node to have the “compute” workload class label.
With the above in mind, and based on my testing, the best and/or simplest approach is to configure the CAS pods to run on the “compute” nodes. That is, nodes that have the ‘workload.sas.com/class=compute’ label and taint applied.
Realising the topology – IAC configuration
As I said in the introduction, I was working in the Azure Cloud and used the SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure GitHub project to build the AKS cluster. The image below shows the node pool definitions that I used for my testing.
Here you can see the compute node pool definition is using the Standard_E8ds_v4 instance type; this provides 8 vCPUs with 64GiB of memory and 300GB of SSD temp storage. This will be used for the Compute and CAS pods.
The nodes have the standard labels and taint applied for the compute nodes.
The second node pool is called generic and does not have any labels or taints applied. It is using the Standard_D4s_v4 instance type, which provides 4 vCPUs with 16GiB of memory; it was my “commodity” VM instance. Note, this node pool has “max_nodes” set to 20, which isn't needed for SAS Viya to run; in fact, using this instance type the deployment spun up 6 nodes.
Tip! When using the IAC and defining nodes without any label or taint, you still must specify the “node_labels” and “node_taints” parameters, with null values, as shown above.
Realising the topology – SAS Viya configuration
The advantage of using the standard compute node configuration is that you only need to focus on the CAS configuration. Let’s look at what is required.
The core of the configuration is that CAS pods need to target the compute nodes and must have a toleration for the compute (workload.sas.com/class=compute) taint.
For this I used the require-cas-label.yaml as a template to configure required scheduling to use the compute label. The ‘require-cas-label.yaml’ can be found in the ../sas-bases/overlays/cas-server folder.
I also used this to set the tolerations for the CASDeployment. The following is the configuration that I used.
Line 10 is highlighted and shows the definition for the required scheduling. The example transformer in sas-bases only has the first '- op: add' statement, which provides the configuration to target CAS nodes; I updated this to target the Compute nodes.
In addition to this update, on lines 16 – 36 you can see the update that I added to replace the tolerations. This configuration also illustrates a change that was introduced at Stable 2023.05 (May 2023), the addition of two new workload classes for CAS.
For this, lines 23 – 29 set the tolerations for the controllerTemplateAdditions and lines 30 – 36 set the tolerations for the workerTemplateAdditions. As you can see the tolerations should be set in three definitions now, not just on the controllerTemplate definition.
Here is the template should you need to copy and paste it.
# PatchTransformer to make the compute label required
# in addition to the azure system label
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: require-compute-label
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
      - compute
  - op: replace
    path: /spec/controllerTemplate/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
  - op: replace
    path: /spec/controllerTemplateAdditions/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
  - op: replace
    path: /spec/workerTemplateAdditions/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*
  version: v1alpha1
In addition to the PatchTransformer above, you also need to set the tolerations for the sas-cas-pod-template. This is done using the following configuration (set-cas-pod-template-tolerations.yaml).
# Patch to update the sas-cas-pod-template pod configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-cas-pod-template-tolerations
patch: |-
  - op: replace
    path: /template/spec/tolerations
    value:
    - effect: NoSchedule
      key: workload.sas.com/class
      operator: Equal
      value: compute
target:
  kind: PodTemplate
  version: v1
  name: sas-cas-pod-template
The two PatchTransformers shown above form the core of the configuration to use the Compute nodes for the CAS pods.
An additional consideration is whether to use the CAS auto-resources configuration. I don't recommend doing this for a couple of reasons. Firstly, and most importantly, my testing showed that using the CAS auto-resourcing prevented the Compute prepull function from operating.
Secondly, the auto-resourcing is intended to dedicate nodes to the CAS pods, and this configuration is looking to share the nodes (between CAS and SAS programming workloads), so it doesn’t make sense to implement the auto-resourcing. See the Deployment Guide: Adjust RAM and CPU Resources for CAS Servers.
However, there is a final configuration that I would recommend. You should set the resource requests and limits for the CAS pods and implement Guaranteed Quality of Service (QoS).
Implementing Guaranteed QoS provides additional protection for the CAS pods and ensures that they will not be killed by the out-of-memory (OOM) processing. The Compute pods will be evicted from the nodes should an out-of-memory situation occur. It should be noted that the sas-compute pods are transient; this is a normal configuration, not an unintended consequence of using the two-node-pool topology.
If a Compute pod gets evicted it just affects one user, while if a CAS pod is evicted it will have an impact on all CAS users (depending on the CAS Server configuration and how the data has been loaded).
To set the CAS pod requests and limits you can use the cas-manage-cpu-and-memory.yaml example in the ../sas-bases/examples/cas/configure folder.
To implement the Guaranteed QoS you set the requests and limits to the same value. For my environment I was using the Standard_E8ds_v4 instance type, which provides 8 vCPUs with 64GiB of memory. For my testing I set the memory requests and limits to 48GiB and the CPU requests and limits to 6. This is shown in the example below.
# This block of code is for adding resource requests and resource limits for
# memory and CPU.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-manage-cpu-and-memory
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits
    value:
      memory: 48Gi
  - op: replace
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/memory
    value:
      48Gi
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources/limits/cpu
    value:
      6
  - op: replace
    path: /spec/controllerTemplate/spec/containers/0/resources/requests/cpu
    value:
      6
target:
  group: viya.sas.com
  kind: CASDeployment
  # Uncomment this to apply to all CAS servers:
  name: .*
  # Uncomment this to apply to one particular named CAS server:
  #name: {{ NAME-OF-SERVER }}
  # Uncomment this to apply to the default CAS server:
  #labelSelector: "sas.com/cas-server-default"
  version: v1alpha1
Using this configuration will leave 2vCPU and 16GiB of memory for other pods. By default, each compute session will request 50millicores and 300MB of memory.
Finally, the kustomization.yaml needs the following updates to implement the configuration. For my environment I used a ‘cas’ folder under ‘/site-config’ to hold the configuration. The configuration needs to be added to the transformers section. For example.
transformers:
:
- site-config/cas/require-compute-label.yaml
- site-config/cas/set-cas-pod-template-tolerations.yaml
- site-config/cas/cas-manage-cpu-and-memory.yaml
Looking at the results
I tested using both an SMP CAS Server and an MPP CAS Server. One of the nice things about the MPP CAS Server deployment was that there were now multiple compute nodes available for the Compute pods. For example.
Here you can see my MPP CAS deployment, a Controller with 4 Workers, all running in the compute node pool (the compute nodes). Each is on a different compute node due to the CPU and memory resource reservations and pod anti-affinity settings.
To further test the configuration, I started two SAS Studio sessions; you can see that one sas-compute pod started on compute node vmss000001 and the other session started on node vmss000003. This is highlighted in the yellow box.
In this second example, I deployed an SMP CAS Server. The first SAS Studio session is using the same node as the CAS Server (vmss00000g). I then manually scaled the Compute node pool to have two nodes. Once the second node was ready, I started a second SAS Studio session; you can see that it is using the vmss00000h node.
Finally, I wanted to confirm the CAS pod configuration. For this I used the kubectl describe pod command.
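For example, assuming a namespace of viya and the default CAS server name, the following shows the resource settings of the controller pod:
kubectl -n viya describe pod sas-cas-server-default-controller | grep -A 2 -E 'Limits|Requests'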
Here you can see that the sas-cas-server pods have the requests and limits set as configured.
Conclusion
Hopefully this demonstrates that it is a relatively simple process to configure the SAS Viya deployment to share a node pool for the Compute and CAS pods.
I see this type of configuration mainly being used for Visual Analytics deployments supporting a small number of programmers. For large environments supporting many programmers and/or heavy CAS processing, or environments looking to further optimise the deployment, dedicated Compute and CAS node pools would still be used.
It should be noted that configuring the node pools using the IaC is a relatively trivial process, so a valid question is whether the added configuration complexity is worth the effort. I will let you decide that. But if your customer wants to limit the number of node pools, it is possible.
Finally, to recap, for a scenario where a shared node pool is desired for CAS and Compute:
Disable the cas auto-resourcing when sharing the nodes for both Compute and CAS workloads.
Manually configure the CAS CPU and memory requests and limits. I recommend using Guaranteed QoS so that the CAS pods are not killed by any OOM processing.
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- architecture
- deployment
- GEL
- SAS Viya
12-18-2022
06:00 PM
In this post we will look at using Azure Container Instances to run SAS Container Runtime (SCR) model images. The SAS Container Runtime is a lightweight Open Container Initiative (OCI) compliant container that provides a runtime environment for SAS models and decision flows.
Once the model or decision flow has been published to a container registry it is truly portable. It can run on a Docker or containerd runtime environment, a Kubernetes cluster, or one of the cloud providers' serverless platforms. Azure Container Instances (ACI) is Microsoft's serverless platform.
Let’s look at deploying SAS Container Runtime models to ACI…
For simplicity when I refer to “SCR models” I mean analytical models and decision flows.
The illustration below provides a summary of the deployment options, depicting using SAS Model Manager to publish SAS models.
Before I get into the details of running SCR models on ACI, I want to take a moment to let you know about a key update that was provided in November (Stable 2022.11). As of Stable 2022.11 TLS support is now provided for the SCR model images. Prior to this only unencrypted access was supported. See the SAS Container Runtime documentation: Configuring TLS Security.
Azure Container Instances – what do you need to know?
The Microsoft documentation positions ACI as follows:
“Run Docker containers on-demand in a managed, serverless Azure environment. Azure Container Instances is a solution for any scenario that can operate in isolated containers, without orchestration. Run event-driven applications, quickly deploy from your container development pipelines, and run data processing and build jobs.”
See the Azure Container Instances documentation - serverless containers, on demand
So, is it a good fit for running the SCR models? I guess the short answer is “yes” or “it depends”!
I do think it is a good fit; the “it depends” comes down to the requirements and whether a serverless platform (ACI) is the best option.
Using Azure Container Instances
Deploying container images to ACI is a straightforward process, and like most things in the Azure cloud there are multiple options when it comes to deploying and configuring objects. You can use the Azure Portal GUI or one of the command line interfaces. In this article I will provide examples using the az command-line interface. The ‘az container create’ command is used to create an ACI container instance.
The ACI instance is associated with a resource group, and at a minimum the following key parameters must be specified:
The name for the ACI instance.
The name of the container image and any credentials to access (download) the image.
The port to be used by the ACI container.
Whether the ACI container will have a public IP address or whether it will run on a private network. If the container has a public IP address, you must also specify a DNS label. The DNS label is not used with private networking.
It is also possible to override the default for CPU and memory allocated to the ACI instance. But if increasing the CPU and memory allocations you must be mindful of the Azure Region you are using, as the limits do vary by region. At the time of writing this, the default in Azure EASTUS was 1 core with 1.5 GB memory. For more details see the following Microsoft documentation: Resource availability for Azure Container Instances in Azure regions.
As far as running SCR model images is concerned, the key consideration is that the Tomcat instance running in the SCR container is configured to listen on port 8080 by default, and on port 8443 when TLS is configured. It is not possible to remap port 8080 or 8443 as part of the ACI deployment, so the ‘--ports’ parameter must be set to ‘8080’ or ‘8443’.
The following example is for running a model with a Public IP address, with public access. Therefore, a DNS label must be specified.
az container create \
  --resource-group my-resource_group-rg \
  --subscription ${SUBSCRIPTION} \
  --name scr-qstree1 \
  --image myacr.azurecr.io/qs_tree1:latest \
  --registry-login-server ${ACR_SERVER} \
  --registry-username ${APP_CLIENT_ID} \
  --registry-password ${APP_CLIENT_SECRET} \
  --dns-name-label qstree1-xxxx \
  --ports 8080
In this example, the SCR model image was stored in the Azure Container Registry. An App Registration (service principal) is used to authenticate to the registry. The App ID and secret were stored in variables.
Once the ACI instance has been created you can view it using the command-line or in the Azure portal. Using the command-line, the ‘az container show’ command gives the following output.
$ az container show --name scr-qstree1 --resource-group ${RG} -o table
Name ResourceGroup Status Image IP:ports Network CPU/Memory OsType Location
----------- -------------------- -------- -------------------------------- ------------------ --------- --------------- -------- ----------
scr-qstree1 my-resource_group-rg Running myacr.azurecr.io/qs_tree1:latest 52.152.247.22:8080 Public 1.0 core/1.5 gb Linux eastus
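Another quick check is to review the container logs to confirm that the SCR application has started; the instance and resource group names are those used above:
az container logs --name scr-qstree1 --resource-group my-resource_group-rg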
The following images are from my test environment. The first shows the resource group and the ACI instance, called ‘scr-qstree1’.
Looking at the ACI instance, you can see the DNS name that was created. This has a format of: dns-label.region.azurecontainer.io
This probably isn't the best deployment, as the model is open to the world and the session traffic is unencrypted; it is not using TLS (HTTPS) encryption.
Therefore, a better approach is to deploy the SCR model image, the ACI instance, using secure private networking (and to configure TLS security). When using private networking the model would only be accessible from within the Azure Cloud, to surface the model a frontend load-balancer or an Azure Application Gateway could be used.
Using a frontend proxy (load-balancer or Azure Application Gateway) will allow the SCR port to be remapped, typically to port ‘80’ or ‘443’.
Using Private Networking
When using private networking you don’t define a DNS label, but you need to specify that the IP address is ‘private’ and the vnet and subnet to be used. For example, the following command would create an ACI instance using private networking:
az container create \
  --resource-group my-resource_group-rg \
  --subscription ${SUBSCRIPTION} \
  --name scr-qstree1 \
  --image myacr.azurecr.io/qs_tree1:latest \
  --registry-login-server ${ACR_SERVER} \
  --registry-username ${APP_CLIENT_ID} \
  --registry-password ${APP_CLIENT_SECRET} \
  --ports 8080 \
  --ip-address Private \
  --vnet my-vnet \
  --subnet my-aci-subnet
The virtual network and subnet must exist before you can deploy the ACI instance. For example, the subnet can be created using the ‘az network’ command:
az network vnet subnet create --name my-aci-subnet \
   --address-prefixes 192.168.3.0/24 \
   --resource-group my-resource_group-rg \
   --vnet-name my-vnet \
   --network-security-group my-nsg
The ‘--address-prefixes’ parameter specifies the CIDR range; in my environment I used 192.168.3.0/24. You can confirm the subnet creation using the following command:
az network vnet subnet list -g ${RG} --vnet-name ${vnet} -o table
In my environment this gave the following result. Note, my resource group had a number of subnets defined; the last one in the list was created for the ACI instance.
Conclusion
I hope you can see that it is very easy to deploy an SCR model image to ACI, but you do need to think about how the model(s) will be secured.
Finally, a limitation of using ACI is that there is no concept of a replica set. If a model needs to be highly available (perhaps distributed across Availability Zones, noting that not all regions support them) or scaled to multiple instances for performance reasons, you must manage that yourself. This could be done by deploying multiple ACI instances and then configuring the frontend load balancer to distribute the workload across the available ACI containers.
If the model(s) being published are mission critical, with service-level requirements mandating high availability and/or workload scalability, then it could be better to use a Kubernetes deployment.
Useful resources
GitHub project: SAS Container Runtime (SCR) - the Low Footprint, High Performance Container for SAS Models
Thanks for reading. Michael Goddard
- Find more articles tagged with:
- Azure
- GEL
- SAS Container Runtime
12-18-2022
05:12 PM
1 Like
In this post we will look at using Python with SAS Analytics Pro. More precisely, calling SAS (Analytics Pro) from a Python programming environment.
SAS provides several mechanisms for integrating the Python language with SAS data and analytics capabilities. One such tool is SASPy, which is a module that creates a bridge between Python and SAS (the SAS Foundation). In this post we will look at the configuration required to integrate SASPy with Analytics Pro.
What is SASPy?
SASPy provides Python APIs to the SAS system, allowing the Python programmer to start a SAS session and run analytics from Python through a combination of object-oriented methods and explicit SAS code submission. Data can be moved between SAS data sets and Pandas dataframes; SASPy also allows the exchange of values between Python variables and SAS macro variables.
Let’s have a look at how this works with Analytics Pro.
SASPy connectivity
SASPy supports several connection methods which are described in the SASPy documentation, see here.
When connecting to Analytics Pro, regardless of where it is running (Windows, Linux, Intel macOS, etc), the SSH (STDIO over SSH) connection method needs to be used. Reading the SASPy documentation you will see that this is for connecting to SAS environments running on a Linux platform. As Analytics Pro is running in a container whose base image is built on Linux [Red Hat Universal Base Image (UBI) 8], SSH connectivity must be used.
Note, in late 2020, Apple began the transition from Intel processors to Apple silicon in Mac computers. Analytics Pro is currently not supported on devices using this new CPU architecture.
Configuring SAS Analytics Pro
The Analytics Pro documentation describes the required configuration, see Enable Use of SASPy.
In my testing I was running Analytics Pro on a Linux server. The following graphic illustrates the environment that I used for my testing.
As previously stated, the SASPy connection uses SSH, so SSH (ideally passwordless SSH) is required. Passwordless SSH is often preferred, as it eliminates the need to prompt the user for a password when connecting to Analytics Pro.
To enable SSH access to the Analytics Pro container, you must configure the following:
Enable the SSH port, port 22 by default. Port 22 in the container needs to be mapped to a port on the Docker host. This is done on the ‘docker run’ command using the ‘--publish’ parameter. For example, ‘--publish 8022:22’. This maps port 22 to port 8022 on the Docker host.
SSHD configuration: a ‘sshd.conf’ configuration file is required in the sasinside/sasosconfig directory. The file doesn’t have to have any content.
In addition to the SSH configuration, two Linux capabilities are required: ‘AUDIT_WRITE’ and ‘SYS_ADMIN’. You enable these with the ‘--cap-add’ parameter on the ‘docker run’ command. The capabilities are required by the container operating system (UBI 8) when enabling SSH (it is not an Analytics Pro requirement). A sketch of a ‘docker run’ command combining these options follows this list.
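For example, a minimal sketch of the launch command might look like the following. Note that the image name and tag, the container name, and the volume path are placeholders, and a real ‘docker run’ command for Analytics Pro will include additional options (licensing, further volume mounts, and so on) as described in the documentation.
docker run --detach \
  --name sas-analytics-pro \
  --publish 8022:22 \
  --cap-add AUDIT_WRITE \
  --cap-add SYS_ADMIN \
  --volume /opt/sas/apro/sasinside:/sasinside \
  sas-analytics-pro:latest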
Once you have started Analytics Pro with this configuration, then it is ready for SASPy connections from the Python programming clients.
Python environment configuration
The configuration of the Python environment is fairly straight forward. The SAS documentation states that along with installing the SASPy package, you also need to install the ‘wheel’ and ‘pandas’ packages.
You also need to generate an SSH key pair (public and private keys) when using passwordless SSH.
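For example, on a Linux client (or from PowerShell on Windows with an OpenSSH client installed) the Python-side set-up might look like the following. The key file name matches the one used later in this post; adjust the paths to suit your environment.
# Install SASPy and the supporting packages
pip install saspy wheel pandas
# Generate an SSH key pair for passwordless SSH
ssh-keygen -t rsa -b 4096 -f ~/.ssh/my_rsa_key -N ""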
It is important to note that Windows doesn’t provide the OpenSSH client. However, there are a couple of options here:
Git for Windows installs its own SSH client.
There is also a GitHub project for OpenSSH, see PowerShell/Win32-OpenSSH.
Once you have generated the SSH key it needs to be copied to the Analytics Pro container.
Using a Linux programming client
On a Linux client you can use the ‘ssh-copy-id’ command to copy the SSH key to the Analytics Pro container. For example:
ssh-copy-id -i identities_file -l login_username docker_server -p port
where:
identities_file: is the SSH key (for example, ‘my_rsa_key’).
login_username: is the username for login to Analytics Pro.
port: is the port of the docker host that is being mapped to port 22 on the Analytics Pro container.
Using a Windows programming client
The OpenSSH client for Windows doesn’t provide the ‘ssh-copy-id’ command. So, manual steps are needed to copy the public key to the Analytics Pro container.
The contents of the public key must be copied to the ‘authorized_keys’ file in the user’s .ssh folder, which is in the user’s home directory in the Analytics Pro container. Depending on the set-up of the environment it may be necessary to create the .ssh folder prior to creating the authorized_keys file.
In my environment I also used the ‘ASKPASS’ utility to help with the SSH commands. It is used to pass the password to the SSH command. For example, I ran the following commands from PowerShell ISE to copy the public key to Analytics Pro.
# Create the users .ssh directory
$env:ASKPASS_PASSWORD = 'xxxxxxx'
$env:SSH_ASKPASS_REQUIRE = "force"
$env:SSH_ASKPASS = "C:\Program Files\OpenSSH\askpass_util.exe"
ssh -o StrictHostKeyChecking=accept-new -p 8022 docker_server -l username "mkdir .ssh"
# Use askpass to copy SSH Public Key to remote host
$env:ASKPASS_PASSWORD = 'xxxxxxx'
$env:SSH_ASKPASS_REQUIRE = "force"
$env:SSH_ASKPASS = "C:\Program Files\OpenSSH\askpass_util.exe"
type $env:USERPROFILE\.ssh\my_rsa_key.pub | ssh -p 8022 docker_server -l username "cat > .ssh/authorized_keys"
SASPy configuration
The final set-up step, once you have the SSH key copied to Analytics Pro, is to create the saspy configuration file, called ‘sascfg_personal.py’ by default.
Below is an example of the SSH profile when using my Windows client as the Python programming environment. Note, the ‘identity’ parameter needs to use escaped backslashes (‘\\’) so that the path is in a format that Python can read.
SAS_config_names = ['ssh']
SAS_config_options = {'lock_down': False,
'verbose' : True,
'prompt' : True
}
#SAS_output_options = {'output' : 'html5'} # not required unless changing any of the default
ssh = {'saspath'  : '/opt/sas/viya/home/SASFoundation/sas',
       'ssh'      : 'C:\\Program Files\\OpenSSH\\ssh',
       'identity' : 'C:\\Users\\student\\.ssh\\my_rsa_key',
       'host'     : 'docker_server',
       'luser'    : 'username',
       'port'     : '8022',
       'options'  : ["-fullstimer"]
       }
Looking at the ‘ssh’ profile:
The ‘saspath’ parameter specifies the path to the SAS foundation in the Analytics Pro container.
The ‘ssh’ parameter is the path to the SSH command on the programming client. In the profile on my Linux client this was set to ‘/usr/bin/ssh’.
The ‘identity’, ‘host’, ‘luser’ and ‘port’ parameters provide the information for the SSH connection.
The ‘options’ parameter is used to specify options on the SAS session.
Start program with SAS in Python
With the set-up completed you are now ready to start programming in Python and using SAS data and PROCs. For example, here is a simple program that I used to query the SASHELP.CLASS table (using my Windows client).
#!/usr/bin/env python
# coding: utf-8
import saspy
import pandas as pd
# Start the session with Analytics Pro
sas = saspy.SASsession(cfgfile='c:\\Users\\student\\saspy\\sascfg_personal.py', cfgname='ssh', results='text')
# Query SAS data
mydata = sas.sasdata("CLASS","SASHELP")
mydata.head()
mydata.describe()
# Close the session
sas.endsas()
This resulted in the following output.
Conclusion
As can be seen, the set-up of Analytics Pro and the Python programming environment is not complex. The only real complexity is when working on a Windows client: there isn’t an ‘ssh-copy-id’ command, so you have to perform the manual steps to copy the public key to the Analytics Pro container.
A final note on using a Windows client: the SASPy configuration and the Python script files need to be UTF-8 encoded.
I hope this is helpful and thanks for reading. @MichaelGoddard.
10-20-2022
06:53 PM
3 Likes
One of the updates with Stable 2022.1.2 was the ability to use the SAS Viya CLI to create model publishing destinations. While a publishing destination can be created using SAS Environment Manager, the credentials domain that is required for some destinations could not.
Prior to Stable 2022.1.2 you had to use the Viya REST API to create a base64 Credentials Domain. The ‘base64’ Credentials Domain is required when publishing a model to a Docker Registry (either a Private Docker destination or one of the Cloud Provider destinations).
I recently tested creating publishing destinations with SAS Viya Stable 2022.1.4 and version 1.20.0 of the SAS Viya CLI.
In this post we will look at this new functionality.
As I stated in the introduction, previously you had to use the Viya REST API to create the credentials domain. Once that had been created, you could use either the Viya REST API or Environment Manager to create the publishing destinations. These are required to publish models or decision flows to a registry, that is, to create the SAS Container Runtime (SCR) Docker container image.
In GitHub there is sample code to create the credentials domain and publishing destination. But you are working directly with the Viya REST APIs so it can be a little complicated and definitely harder than using a standard command-line interface (CLI).
See GitHub project: Configuring Publishing Destinations
The first thing to state is that you should always use the latest version of the Viya CLI to get the updates for creating publishing destinations. You can download the SAS Viya CLI directly from the SAS Support website. The download file is available here: Downloads: SAS Viya CLI.
For this support you need the CLI version 1.19.5 or higher. At the time of posting this, version 1.20.0 was available. For general information on using the CLI, see the SAS Viya Administration manual, SAS Viya: Using the Command-Line Interface.
What do you need to know?
Before I get into the details of using the CLI, here are some things to note:
You can’t separately create a credentials domain with the Viya CLI; it is created as part of creating the publishing destination.
This includes setting the user or group information for the domain.
The secrets stored in a base64 credentials domain must be base64 encoded.
While a credentials domain is created as part of creating a publishing destination, the domain can be shared with multiple publishing destinations.
The description fields must be quoted.
The Viya CLI is self-documenting, for example to get help on creating a publishing destination:
./sas-viya models destination --help
This gives the following output.
Figure 1. Viya CLI help
To get the help for creating an Azure publishing destination, you would use the following:
./sas-viya models destination createAzure --help
Creating Publishing Destinations using the CLI
Here is an example of creating an Azure publishing destination (in my Viya environment). When I use the CLI, I use the Viya namespace name as the profile name.
./sas-viya --profile ${NS} models destination createAzure \
--name "testACR" \
--description "Test ACR" \
--baseRepoURL ${ACR_SERVER} \
--subscriptionId ${SUBSCRIPTION} \
--tenantId ${TENANT} \
--region ${REGION} \
--kubernetesCluster ${AKS_NAME} \
--resourceGroupName ${RG} \
--credDomainID "ACRCredDomain" \
--credDescription "Azure ACR credentials" \
--clientId ${APP_CLIENT_ID} \
--clientSecret ${APP_CLIENT_SECRET} \
--identityType user \
--identityId sasadm
If we look at this in more detail, the image below highlights the parameters that relate to the credentials domain definition, see lines 10 - 15.
For Azure, the ‘clientId’ and ‘clientSecret’ are for the Azure App Registration, which is used to authenticate to the Azure Container Registry (ACR). They are stored as part of the base64 credentials domain, so the values used must be base64 encoded.
For this I used the following commands to set the variables being used:
export APP_CLIENT_ID=$(echo -n "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" | base64)
export APP_CLIENT_SECRET=$(echo -n "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" | base64)
(The ‘-n’ option prevents a trailing newline from being included in the encoded value.)
The command output confirms that the publishing destination has been successfully created. For example, running the command above produced the following output:
ID Name Destination Type Description
4372408b-4a01-42a8-99dd-f8848f3285ed testACR azure Test ACR
Once the credentials domain has been created it can be used by other publishing destination definitions. When referring to an existing credentials domain (which could have been created via the REST API or using the Viya CLI) you just need to specify the “--credDomainID” parameter; lines 11 to 15 are not required.
For example, the following creates an Azure publishing destination called 'myACR' using the 'ACRCredDomain' credentials domain.
./sas-viya --profile ${NS} models destination createAzure \
--name "myACR" \
--description "Azure publishing destination for Mike" \
--baseRepoURL ${ACR_SERVER} \
--subscriptionId ${SUBSCRIPTION} \
--tenantId ${TENANT} \
--region ${REGION} \
--kubernetesCluster ${AKS_NAME} \
--resourceGroupName ${RG} \
--credDomainID "ACRCredDomain"
Once the publishing destination has been created you can use the ‘list’ command to confirm the available destinations. For example:
./sas-viya --profile ${NS} models destination list
You can also get the details of the publishing destination using the ‘models destination show’ command. For example, in this case:
./sas-viya --profile ${NS} models destination show -n testACR
When using this command (models destination show) there is a known issue where the ‘models’ plugin assumes a destination type of CAS. A fix is planned, but currently no fix date is available. Therefore, the best approach is still to use Environment Manager to view the details of a publishing destination. This is shown in the following image.
Conclusion
I like this update, as you no longer have to work directly with the Viya REST API. Using the SAS Viya CLI is a much better approach and hides the complexity of working with the REST API.
Finally, as can be seen from Figure 1, it is also possible to update and delete the publishing destinations using the CLI.
I hope this is useful and thanks for reading.
- Find more articles tagged with:
- GEL
- SAS Model Manager
- SAS Viya CLI
10-19-2022
06:38 PM
2 Likes
I have recently had a number of conversations around ‘autoscalers’: Cluster Autoscalers and Horizontal Pod Autoscalers (HPA). There seems to be some misunderstanding about how these are used. So, I thought it would be a good time to think about them and what is supported with SAS Viya.
In this post we will discuss the difference between Cluster Autoscalers and Horizontal Pod Autoscalers. I will also look at what is required to define a HPA and discuss an example of using an HPA for SAS Micro Analytics Service.
But first, some definitions. A Cluster Autoscaler automatically adjusts (grows and shrinks) the size of the Kubernetes cluster (the number of underlying nodes) under the following conditions:
There are pods that fail to run due to insufficient resources (this does not necessarily mean that all nodes are maxed out, as the pod scheduling is controlled by many factors).
There are nodes in the cluster that are underutilized for a defined period, it may be possible for the pods to be placed on another node that meets the scheduling criteria.
Horizontal Pod Autoscalers, on the other hand, apply to the Kubernetes (K8s) pods, as the name suggests. In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload (pods) to match demand. It defines the conditions for scaling the number of pod replicas up and down.
So, Cluster Autoscaler & Horizontal Pod Autoscaler are two independent features that do have a relationship when we think about the “elasticity” of the Kubernetes cluster, and the infrastructure costs (particularly if running on one of the Cloud Providers platforms). That is, the number of running pods and the HPA definitions can trigger the Cluster Autoscaler.
But what does this mean for SAS Viya?
If we look at the default deployment of SAS Viya, there is some redundancy, High Availability (HA) if you like, for the stateful services (Consul, RabbitMQ, Postgres, Cache Locator and Server), with multiple pod replicas being configured for these services. By default, the configurations for CAS (SMP is the default) and OpenSearch are not deployed with redundancy.
But all the Stateless services (including the web applications) have a single pod instance defined.
There is the Kubernetes transformer (enable-ha-transformer.yaml) that enables HA for the Stateless microservices. This provides two replicas for the Stateless microservice pods.
However, at this point in time, the Viya deployment doesn’t support deploying the microservices with an HPA definition that uses different values for the ‘Min’ and ‘Max’ number of pod replicas. This is because we do not set (define) the Kubernetes HPA behaviors. More research is required on all our microservices to understand their behaviors before doing this.
The SAS documentation states the following “By default, the Horizontal Pod Autoscaler (HPA) setting for all services is set to a replica of 1. If you want to scale up your services or pods to more than 1 replica, then the default HPA setting should be modified.”
To help you understand the SAS Viya deployment, below are a couple of handy commands. For example, to get the summary information for an HPA, in this case for MAS (sas-microanalytic-score), you can use the following command:
kubectl get hpa sas-microanalytic-score -n viya-namespace
You will see output similar to the following.
In the image you can see the ‘TARGETS’ field, it shows the current CPU utilization and the target utilization. You can also see that the MIN and MAX number of pods is set to 1, and there is only one MAS pod running.
To get more detailed information on the HPA you can use the ‘kubectl describe’ command. For example.
kubectl describe hpa sas-microanalytic-score -n viya-namespace
Below is the output for my SAS Viya deployment.
Here we can see that the CPU resource utilization is expressed as a ‘percentage of the pod requests’. Once again, we can see that the Min replicas and Max replicas is set to one (1).
So, when might we use an HPA?
At this point I can hear you say, “but I thought you just told us not to use custom HPA definitions!”
Well yes, but there might be a limited number of scenarios where this is useful. For example, workloads running in the SAS Micro Analytic Service (MAS) and Event Stream Processing (ESP).
Let’s explore workloads running on SAS Micro Analytic Service. The key thing to remember here is that all the models and decision flows published to MAS (maslocal) run in the same pod, unlike SAS Container Runtime where there is only one model or decision per container image.
This affects the resources (CPU and memory) that the MAS (sas-microanalytic-score) pods need to run. The number of models and decision flows published will also affect the start-up time for the sas-microanalytic-score pods and the workload that they are handling.
Hence, this could be a good candidate for defining an HPA. Especially when we think about handling bursts of transactions.
But that might be a too simplistic view, as when the models and decision flows are embedded within ‘real-time’ business processes, high availability could be the primary driver, closely followed by latency (performance). Therefore, to meet the HA requirements you might deploy multiple MAS replicas and need multiple nodes for this workload. Remember, by default both MAS and ESP are defined as Stateless services, so will run with all the other Stateless pods.
Which brings me back to the Cluster Autoscaler. Scaling the nodes is not instantaneous, it can take a few minutes to get a new node. This is another key concern when designing the Viya platform to support the real-time processing.
Another consideration is that MAS is not a standalone service, the sas-microanalytic-score pod(s) are dependent on other SAS Viya services. Therefore, the MAS (or real-time) HA requirements will, or can, drive the need for an HA configuration for the SAS Viya environment.
Writing an HorizontalPodAutoscaler
In a former life before joining SAS, when modelling IT systems, we had a rule of thumb that burst traffic could be up to 20 times the average transaction rate. Think of your favorite retailer or airline making a “must have” offer that drives unprecedented demand.
The use of an HPA for MAS or ESP could be a good way to handle such peaks.
But this does drive the need for a deeper understanding of the application pods, including its resource requirements and how long it takes to scale up and be ready.
You also need to decide on the metric (CPU or memory utilization) and the threshold that will trigger the HPA. This is all defined in the HPA ‘target:’ spec definition. You should also set the behaviors for the pod; these define the rules for scaling up and down and should be based on how long it takes for the pod to be ready to accept workload.
To put it simply, the HorizontalPodAutoscaler controller operates on the ratio between desired metric value and current metric value.
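From the Kubernetes documentation that this discussion is based on (see the references), the scaling algorithm can be expressed roughly as:
desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / desiredMetricValue ) ]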
It is also important to understand that when a targetAverageValue or targetAverageUtilization is specified, the currentMetricValue is computed by taking the average of the given metric across all pods in the HorizontalPodAutoscaler's scale target (the workload the HPA refers to; see the MAS example below), with the resulting replica count bounded by the ‘minReplicas’ and ‘maxReplicas’ definitions.
When managing the scale of a group of replicas using the HorizontalPodAutoscaler, it is possible that the number of replicas keeps fluctuating frequently due to the dynamic nature of the metrics evaluated. This is sometimes referred to as thrashing, or flapping. This is where the HPA behaviors definition comes into play.
Let’s look at some examples
Note, the generic examples below have been taken from the Kubernetes documentation, see the references.
Any HPA target can be scaled based on the resource usage of the pods in the scaling target. When defining the pod specification, the resource requests for CPU and memory should be specified; these are used to determine the resource utilization and are used by the HPA controller to scale the target up or down. For example, to use resource utilization-based scaling, specify a metric source as follows:
type: Resource
resource:
  name: cpu
  target:
    type: Utilization
    averageUtilization: 60
With this definition (metric) the HPA controller will keep the average utilization of the pods in the scaling target at 60%. This is done by scaling up or down the number of pods within the bounds of the ‘minReplicas’ and ‘maxReplicas’ definitions.
Configuring scaling behavior
The ability to define behaviors was introduced with v2 of the HorizontalPodAutoscaler API. The behavior field is used to configure separate scale-up and scale-down behaviors. You specify these behaviors by setting scaleUp and / or scaleDown under the behavior field. Additionally, you can specify a stabilization window that prevents ‘flapping’ the replica count for a scaling target.
The following example shows defining a behavior for scaling down:
behavior:
  scaleDown:
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60
    - type: Percent
      value: 10
      periodSeconds: 60
The periodSeconds indicates the length of time in the past for which the policy must hold true. The first policy (type: Pods) allows at most 4 replicas to be scaled down in one minute. The second policy (type: Percent) allows at most 10% of the current replicas to be scaled down in one minute.
When you define multiple policies like this, by default the policy that allows the greatest amount of change is selected. In this example, the second policy will only be used when the number of pod replicas is more than 40. This is because the second policy specifies 10% of the running pods, and that value is only greater than 4 when there are more than 40 pod replicas.
Setting the Stabilization windows
As previously stated, the stabilization window is used to restrict the ‘flapping’ of the replica count, when the metrics used for scaling keeps fluctuating. Hence, the stabilization window is used to avoid unwanted changes.
For example, the following snippet shows specifying a scale down stabilization window. In this example, all desired states from the past 5 minutes will be considered.
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
Pulling this all together, here is a possible example for MAS…
Please note this isn’t a full worked example, which is another way of saying I haven’t tested it. 😊 Perhaps the HPA for MAS might look something like the following.
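Based on the goals unpacked below, a sketch of such an HPA definition might look like the following. This is untested; the API version, metadata, and the assumption that the scale target is the ‘sas-microanalytic-score’ Deployment are for illustration only.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sas-microanalytic-score
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sas-microanalytic-score
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 1
        periodSeconds: 30
    scaleDown:
      selectPolicy: Min
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
      - type: Percent
        value: 50
        periodSeconds: 60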
Let’s unpack this a little to see what I was trying to achieve:
I want a minimum of 2 MAS pods for HA reasons, but no more than 6 pods.
I should trigger the scale event on 60% CPU utilization.
There is no stabilization window for scaleUp events, but only one pod is added every 30 seconds. (You need to understand your environment to determine how long it takes for a MAS pod to be fully ready to receive workload.)
I have specified two scaleDown policies and the minimum of the two should be used. The first is that no more than 2 pods can be removed in a 60 second period and the second is that 50% of the pods can be removed in a 60 second period.
In all reality I would probably just specify a single scaleDown policy with such a small number of replicas. But I wanted to show an example of using two policies.
Hopefully this example highlights the need to understand the MAS workload and how long the MAS pods take to start. Remember, this will depend on the number of models that have been published.
Conclusion
In this post we have only just scratched the surface of understanding HPAs; it is a truly complex subject. But I hope I have highlighted the need for a deep understanding of Kubernetes and of how your applications run (behave) in order to properly specify an HPA.
While the MAS example defines the HPA based on utilization, it is also possible to set the target based on a value, for example, the number of milli-cores or cores used for CPU, or the amount of memory used.
Finally, I would recommend load testing to fine tune the HPA definition.
I hope this is useful and thanks for reading.
References
Kubernetes documentation: Horizontal Pod Autoscaling
The high-level description and examples above are based on this Kubernetes documentation.
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- GEL
- Kubernetes
- Real-time
08-04-2022
06:13 PM
2 Likes
While SAS Analytics Pro doesn’t ship with the SAS Cloud Analytic Services (CAS) server, it is possible for Analytics Pro to use CAS. In this blog we will explore the configuration required to use a CAS Server from Analytics Pro. To enable the connectivity, configuration is required for both SAS Viya and SAS Analytics Pro.
Before we get into the configuration of SAS Analytics Pro, I want to take a moment to remind you that as of April 2022 (Stable 2021.2.6) there are two versions of Analytics Pro: SAS Analytics Pro and SAS Analytics Pro Advanced Programming. SAS Analytics Pro Advanced Programming contains the same SAS Foundation components as SAS Analytics Pro plus the following additional components:
SAS/IML
SAS/OR
SAS/QC, and
SAS/ETS.
Configuring access to CAS
To connect to a CAS Server from Analytics Pro, Analytics Pro must trust the certificate being used by the CAS Server. Along with the CA certificate, the SAS Viya platform needs to be configured to allow access to the CAS Server.
We will start by looking at the SAS Viya configuration.
If you want to connect to CAS in SAS Viya 4 from clients such as SAS Viya 3.5, SAS 9.4, or with open programming clients such as Python, R, and Java, you need to enable the binary CAS communication. In this case the connection is from Analytics Pro.
With SAS Viya now running on Kubernetes, the external connectivity requires additional configuration to enable external connections to CAS (connections from outside of the Kubernetes cluster).
For SAS Viya 4, you enable the CAS connectivity by including the cas-enable-external-services.yaml transformer, which is added to the transformers section of the 'kustomization.yaml' file.
The patch transformer is required to expose the CAS client connectivity ports. This can be done using either a NodePort configuration or a LoadBalancer configuration. For instructions on how to implement this see the SAS Viya Administration manual, Configure External Access to CAS.
You need to copy the example from the sas-bases directory ($deploy/sas-bases/examples/cas/configure/cas-enable-external-services.yaml) to your deployment’s site-config directory.
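For example, a sketch of the copy step might be the following (the target sub-folder under site-config is your choice; this assumes the file is placed directly under site-config):
cp $deploy/sas-bases/examples/cas/configure/cas-enable-external-services.yaml \
   $deploy/site-config/cas-enable-external-services.yaml
Then reference the copied file in the transformers block of the kustomization.yaml:
- site-config/cas-enable-external-services.yaml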
The default configuration defines the services as NodePorts; this is shown in the image below. If the SAS Viya platform is running in the cloud, on one of the Cloud Provider platforms (AWS EKS, Azure AKS or Google GKE), using a LoadBalancer configuration is the recommended approach.
The patch transformer (cas-enable-external-services.yaml) contains example configurations to enable the LoadBalancer configuration, see the highlighted code block below.
You need to uncomment the service spec to define the service type of Loadbalancer.
As part of updating the patch transformer, you should specify the allowed source CIDR ranges to secure the connections to the CAS server, see lines 32 to 34. This is used to define the firewall rules, such as the Azure Network Security Group (NSG) rules.
Note, some cloud providers may require additional configuration. For example, adding metadata annotations.
After the patch transformer has been applied, the following command can be used to get the port mappings for the CAS Server. You will need the port information for the 'sas-cas-server-default-bin' port to connect to CAS.
kubectl -n viya-namespace get svc | grep sas-cas-server
The output for a NodePort configuration will look similar to the following (using egrep to format the output).
And looks like the following for a LoadBalancer configuration. The public IP addresses for the LoadBalancer service are shown in the yellow box.
As can be seen, the LoadBalancer service has two public (External) IP addresses.
You would, or could, use a NodePort configuration when running on-premises, for example when using Red Hat OpenShift or the ‘Open Source Kubernetes’ support. However, it is important to note that when using a NodePort, the port mapping is exposed on ALL nodes within the cluster. Also, the IP address and port mapping are not static; if you redeployed SAS Viya, you would get a new mapping.
Hence, the LoadBalancer configuration is a better approach, as it exposes a single public IP address for each service. However, the public IP address is not static and will change each time you do a SAS Viya deployment. Therefore, it is best to assign a DNS alias to the public IP address so that end users have an unchanging reference, but you’ll still need to keep that DNS alias up to date if the load balancer changes at some point in the future.
A customer could use their own DNS, or if running in one of the Cloud Providers a DNS name can be set against the LoadBalancer resource. For example, in Azure you can use the following to assign a DNS name to the public IP address of the binary communication endpoint for CAS.
node_res_group=MC_xxxxxxxxx_xxxxxxxxxx_xxxxxx
SUBSCRIPTION=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
NS=viya-namespace
# Get LoadBalancer External IP
CASBIN_LBIP=$(kubectl get service -n ${NS} | grep sas-cas-server-default-bin | awk '{print $4}')
# Get the Public IP resource name
PublicIPName=$(az network public-ip list --subscription ${SUBSCRIPTION} --out table | grep ${CASBIN_LBIP} | awk '{print $1}')
# Get the ID for the Public IP
PublicIPID=$(az network public-ip show -g ${node_res_group} -n ${PublicIPName} --query "id" -o tsv)
# Create the DNS name
az network public-ip update \
-g ${node_res_group} \
--ids $PublicIPID --dns-name cas-bin-${NS}
For my environment, this gives a DNS name of ‘cas-bin-test.eastus.cloudapp.azure.com’ when using the viya namespace name (test) as part of the name prefix. You can check the DNS name assignment in the Azure Portal, see the screenshot below.
Note, the prefix must be unique so you could also use the resource group name as part of the DNS name. This would give ‘cas-bin-resource-group.eastus.cloudapp.azure.com’ as the name.
Now you have a logical name to use when connecting to the CAS Server.
Similarly, the following code would set the DNS name for the public IP address of the http communication endpoint for CAS.
node_res_group=MC_xxxxxxxxx_xxxxxxxxxx_xxxxxx
SUBSCRIPTION=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
NS=viya-namespace
# Get LoadBalancer External IP
CASHTTP_LBIP=$(kubectl get service -n ${NS} | grep sas-cas-server-default-http | awk '{print $4}')
PublicIPName=$(az network public-ip list --subscription ${SUBSCRIPTION} --out table | grep ${CASHTTP_LBIP} | awk '{print $1}')
PublicIPID=$(az network public-ip show -g ${node_res_group} -n ${PublicIPName} --query "id" -o tsv)
az network public-ip update \
-g ${node_res_group} \
--ids $PublicIPID --dns-name cas-http-${NS}
Before we look at the Analytics Pro configuration, let’s look in more detail at the resources that are created. In Azure, the public IP addresses are defined in a specific resource group, starting with “MC_”. For example, the two highlighted in the screenshot are for the CAS services. The third public IP address shown in the image was created to support the ingress access to the Viya web applications.
Looking at the Kubernetes Load balancer, you will see all the public IP addresses. Again, the highlighted ones are for the CAS services.
Finally, if you look at the NSG, you will see the rules that were created. You can see the rules for ports 5570 and 8777 using the source CIDR addresses that were defined in the patch transformer.
The screenshots confirm that the patch transformer has successfully configured the AKS cluster for my environment.
SAS Analytics Pro configuration
Now that SAS Viya has been configured to allow client access, or in this case access from Analytics Pro, the next task is to configure Analytics Pro.
As stated earlier, Analytics Pro needs to trust the certificate being used by the CAS Server. The first step is to obtain the SAS Viya Root CA (Certificate Authority), then you need to include it in the Analytics Pro trusted certificates, in the 'trustedcerts.pem' file. You need a running instance of Analytics Pro to do this.
Step 1. Obtaining the SAS Viya Root CA
The steps to get the CA certificate are detailed in the SAS Viya Administration manual, see: Obtain the Truststore Files or the SAS Viya Generated Root CA Certificate. The following command can be used to retrieve the SAS Viya CA certificate value from the secret:
kubectl -n viya-namespace get secret sas-viya-ca-certificate-secret -o=jsonpath="{.data.ca\.crt}"|base64 -d
Note, you need to run the command above from a client that has been configured to connect to the Kubernetes cluster (i.e., has the kubectl configuration for the cluster where SAS Viya is running).
You can pipe the output of the command to a file to save the certificate – ideally to a local directory corresponding to the mounted volume inside the sas-analytics-pro container referred to as “/sasinside”.
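For example, the following saves the certificate using the file name referenced later in this post (assuming ‘/opt/sasinside’ is the local directory mounted as ‘/sasinside’ in the container):
kubectl -n viya-namespace get secret sas-viya-ca-certificate-secret \
  -o=jsonpath="{.data.ca\.crt}" | base64 -d > /opt/sasinside/ca_cert.pem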
Step 2. Update Analytics Pro to trust the Viya certificate
Now, the Viya CA certificate needs to be included in the Analytics Pro trusted certificates, in the 'trustedcerts.pem' file. This file is found in the sas-analytics-pro container in the /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/ directory.
To include the Viya certificate you need a running Analytics Pro environment. The easiest way to do this is to ‘exec’ into the Analytics Pro container. You can use the following command to append the certificate information. The example assumes that the Viya CA is stored in the Analytics Pro configuration ‘sasinside’ folder.
docker exec -u=root -it sas-analytics-pro bash \
-c "cat /sasinside/ca_cert.pem >> /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem"
If you want to confirm that the 'trustedcerts.pem' file has been updated, use the following command.
docker exec -u=root -it sas-analytics-pro bash \
 -c "cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem"
Note, the docker commands assume that the Analytics Pro container is called ‘sas-analytics-pro’.
It is possible to write a script to get the Viya certificate and update Analytics Pro as part of launching Analytics Pro.
Now that Analytics Pro has been updated, it can handle encrypted communication with the CAS server in your Kubernetes cluster. The last part of the configuration is to tell Analytics Pro where to find CAS with the following connection string.
options cashost='dns_alias_for_cas' casport=port-number authinfo='~/.authinfo';
For information on creating a ‘authinfo’ file see: SAS Help Center: Client Authentication Using an Authinfo File
For testing I used one of our GEL environments.
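Using the DNS alias created earlier, a minimal test might look like the following. The host name is from my environment, 5570 is the default CAS binary port exposed by the LoadBalancer service, and the session name ‘test’ is just an example.
options cashost='cas-bin-test.eastus.cloudapp.azure.com' casport=5570 authinfo='~/.authinfo';
cas test;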
You should see the following:
You have successfully started a session called 'TEST'.
The session is using 3 workers, and
You will see that the client user is 'sastest2', this was defined in the authinfo file.
I hope this is useful and thanks for reading.
References
SAS Help Center: Welcome to SAS Analytics Pro SAS Help Center: Client Authentication Using an Authinfo File
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- GEL
07-28-2022
05:47 PM
4 Likes
With the release of Stable 2021.2.6 there were some changes that will affect your deployment topology, workload placement plan, and node selection. SAS has recently changed the default set of workload classes, so now the CONNECT workload class is optional and requires additional steps to be enabled. In this post we will discuss the new topology and when you still may need to implement the CONNECT workload class. An additional change that will affect your node selection is that the use of GPUs is now supported with some Compute processing, not just the CAS Server.
CONNECT Workload Class Changes
With Stable 2021.2.6 the default workload classes have changed, the “connect” workload class has been removed from the default configuration.
This is to reflect that when the SAS/CONNECT Spawner is supporting connections from a SAS 9.4M7, Viya 3.5 or another Viya 4 system (client), the Spawner is performing purely as a service, it is not running any of the remote workload.
This change affects the sas-connect-spawner Deployment definition. All references to the connect workload class have been removed from the sas-connect-spawner Deployment definition (this includes the labels, nodeAffinity and tolerations) and have been replaced with the “stateless” workload class. The result is that the SAS/CONNECT Spawner will now be scheduled on “stateless” nodes by default.
To illustrate the changes, I ran the ‘icdiff’ command to show the differences between Stable 2021.2.5 and 2021.2.6 for the sas-connect-spawner deployment.
In (1) you will see the label change to categorize the Spawner as a stateless service, applying the stateless workload class label, (2) shows the node affinity for the stateless nodes, and (3) shows the update to the pod tolerations. As with the other stateless services, the Spawner pod has a toleration for both the stateful and stateless taints.
Stepping back from the yaml changes. Let’s take a moment to discuss SAS/CONNECT and the different session types that are supported, and how this can affect your deployment topology (the number of node pools).
The following description is from the SAS Viya Programming Documentation. “SAS/CONNECT software is a SAS client/server toolset that provides the ability to manage, access, and process data in a distributed and parallel SAS environment. As a client/server application, SAS/CONNECT links a SAS client session to a SAS server (SAS/CONNECT Server) session.”
The SAS/CONNECT Spawner is a SAS Viya service that launches processes on behalf of SAS/CONNECT clients. The client processes can be launched in their own pods (referred to as “dynamically launched pods”) or in the SAS/CONNECT Spawner pod (in this mode, the Spawner pod supports the sessions from multiple clients).
When the client process is launched in its own pod, the “dynamically launched pod”, the new pod is started using a Kubernetes PodTemplate (sas-connect-pod-template) and runs on the Compute nodes by default. The dynamically launched pod contains the SAS/CONNECT Server for that client session.
In the second case, when the client process is launched in the SAS/CONNECT Spawner pod, the SAS/CONNECT Server process is running in the Spawner pod, and the Spawner pod may be supporting multiple client sessions. We could call this “legacy” mode, it is how the legacy clients are supported.
Note, clients from SAS 9.4M6 and earlier releases, and SAS Viya 3.4 and earlier, do NOT support dynamically launched pods. So, by default their processes are launched in the SAS/CONNECT Spawner pod. They are the SAS/CONNECT “legacy clients”.
This begs the question “When do I need a node pool dedicated to the SAS/CONNECT workload”?
I have created a decision flow to help answer this question, see later in this post.
From a resource consumption perspective, the dynamically launched pods are similar to the SAS Compute Server workload, and as previously stated, the launched pods run on the Compute nodes by default.
However, when the SAS/CONNECT Spawner pod is running multiple client sessions it can consume significant resources. Therefore, much like the CAS pods, the SAS/CONNECT Spawner pod should be assigned a dedicated Kubernetes node and should be configured with a guaranteed Quality of Service (QoS).
Hence, if you do not have any legacy client sessions, the SAS/CONNECT Spawner can happily run as a “stateless service”. To support this, as of Stable 2021.2.6, the SAS/CONNECT Spawner is deployed in the stateless workload class by default. This means that implementing the connect workload class is ONLY required, or recommended, if you are supporting the legacy clients.
To implement, enable, the ‘connect’ workload class there are two new transformers:
enable-spawned-servers.yaml
use-connect-workload-class.yaml
Along with applying the patch transformers, you also must create the ‘connect’ node pool and label and taint its nodes for the ‘connect’ workload class (see the sketch below). This is what I have called the “old topology” in the decision flow.
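For example, on a generic Kubernetes cluster the label and taint might be applied as follows. The node name is a placeholder; on a cloud provider you would normally define the label and taint on the node pool itself so that new nodes inherit them.
kubectl label nodes <connect-node-name> workload.sas.com/class=connect
kubectl taint nodes <connect-node-name> workload.sas.com/class=connect:NoSchedule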
GPU support for SAS Compute
The other change that I would like to briefly touch on is that the SAS Programming Environment container can now make use of GPUs, can use the SAS GPU reservation service. Prior to Stable 2021.2.6, the GPU reservation service was only used by the CAS Server.
The update extends support to SAS IML workloads (PROC IML) running on the Compute Server. For a complete list of the GPU support see the Offerings and Action Sets that Support GPU Capabilities section in the System Requirements for SAS Viya.
It is important to note that GPU support (for CAS or Compute) is not available when running on Red Hat OpenShift. Also see the following blog by Raphaël Poumarede, Add a CAS “GPU-enabled” Node pool to boost your SAS Viya Analytics Platform!
Topology decision flow
Even prior to this change, it wasn’t mandatory to implement a dedicated node pool for the connect workload, it was possible to use one of the other node pools for the CONNECT Spawner pod. However, depending on the tainting of the nodes this may have needed a custom configuration for the sas-connect-spawner Deployment.
For example, you might do this if the CONNECT workload is quite light and the Spawner pod is only supporting a small number of sessions. I’m sorry I can’t give you a formula to help you determine when a dedicated node pool is required; I would see this as part of regular capacity planning. Monitor the performance and resource usage and scale out to using a dedicated CONNECT node when needed.
As discussed above, the Compute nodes are a good fit for the launched pods, they are just another type of compute session.
However, there are cases where you might still want to implement a ‘connect’ node pool to isolate the connect processing. For example, with the change in support for GPUs, the Compute nodes could be GPU enabled, but this is not required for the CONNECT sessions. Therefore, to optimize costs you might want to change the default configuration to use a different node type for the CONNECT workload.
Below is a decision flow to help with the assessment of whether the ‘connect’ workload class and a dedicated ‘connect’ node pool need to be implemented.
The most likely paths through the decision flow are shown as ‘A’ (the blue path) and ‘B’ (the green path). I would hope that most customers will use the default configuration and are not supporting the legacy clients, they will be using the green path (B).
Conclusion
The good news is with Stable 2021.2.6, if there are no legacy clients, there is no need to have a dedicated node pool for SAS/CONNECT, by default there is no ‘connect’ workload class. The SAS/CONNECT Spawner follows the Viya architecture pattern and works as a stateless service.
However, a key thing to remember is that this change will NOT be available in LTS 2022.1 (May); customers will have to wait until LTS 2022.2 (November). In the meantime, sites on the LTS cadence will continue to require a custom configuration to implement this.
Similarly, the new GPU support will not be available in the LTS cadence until LTS 2022.2.
Finally, with the recent changes it makes it easier to “grow” or “shrink” the topology. For example, start with three node pools and grow to four, or five (to separate the Stateful and Stateless services) when needed.
I hope this is useful and thanks for reading.
Find more articles from SAS Global Enablement and Learning here.
- Find more articles tagged with:
- GEL
03-29-2022
04:18 PM
2 Likes
In previous posts I have talked about creating a Workload Placement Plan or Strategy. One of the benefits of running in the cloud is that the Cloud Providers offer elastic infrastructure. In Kubernetes terms, this equates to node pools that can scale from zero nodes to a maximum number of nodes. But if you are using node pools that can auto-scale (scale to zero nodes) you might get some unexpected results.
I was recently testing a deployment in Azure using the SAS Viya Infrastructure as Code (IaC) GitHub project, using the minimal pattern with two node pools where both could scale to zero nodes. When I deployed SAS Viya all the pods ended up running in a single node pool! This wasn’t what I was after.
So, what went wrong with my CAS workload placement?
Let’s have a look at why this happened.
I built my Azure Kubernetes Service (AKS) cluster using the minimal IaC example, see here. Which provides a System node pool, plus a node pool called ‘generic’ and one called ‘cas’. As the names might suggest, the ‘cas’ node pool was to be dedicated to running the SAS Cloud Analytic Services (CAS) pods and the ‘generic’ node pool was for everything else.
Both the ‘generic’ and ‘cas’ node pools could auto-scale to zero nodes, which meant that when I built the cluster it only had the system node pool active, with one node running. For the cas node pool, the nodes had the CAS workload label and taint (workload.sas.com/class=cas) applied. The generic node pool didn’t have any taint, but had the following labels applied:
workload.sas.com/class=compute
launcher.sas.com/prepullImage=sas-programming-environment
These labels are used as part of the pre-pull process for the 'sas-programming-environment' pods. As this was a test environment, my first deployment used an SMP CAS server (with a default SAS Viya configuration) without the CAS auto-resources transformer. After seeing that all the pods, including the CAS pods, ended up running on the generic nodes, I did a second deployment using an MPP CAS server to confirm what I was seeing. This is shown in Figure 1.
Figure 1. IaC minimal sample without CAS auto-resources
The default SAS Viya configuration uses preferred node affinity, see the Kubernetes documentation Assigning Pods to Nodes. Hence, I could have labeled the figure as “Preferred node affinity without CAS auto-resources”.
As you can see, all the CAS pods are running on the generic nodes, and I ended up with three workers running on the same node (aks-generic-xxxxxx-vmss00000g). Having the three workers on the same node was understandable (as I was not using the CAS auto-resources), but why didn’t the CAS nodes get used?
The answer to this question lies in the default pod configuration, which uses preferred scheduling (preferredDuringSchedulingIgnoredDuringExecution) for the node affinity, combined with the node pool configuration and its state at the time of deploying SAS Viya. Let’s explore what I mean by that.
Both node pools could autoscale to zero, which can occur when the minimum node count for a node pool is set to zero. Therefore, part of the answer lies in the SAS Viya start-up sequence and the SAS Viya default configuration.
That is, some of the first objects to start are the stateful and stateless pods. This meant that by the time the CASDeployment operator went to start the CAS controller and worker pods (or the SMP CAS server), there were already generic nodes available.
This is where the preferred node affinity comes into play. It is only a preference that the cas nodes are used; if there aren’t any cas nodes available, another choice is evaluated. Instead of spinning up a new cas node, the pods were started on the generic nodes because they didn’t have any taint applied.
Hence, it is a combination of these three factors (the untainted generic nodes, zero cas nodes and preferred affinity) that led to this situation. If you create node pools with a non-zero number of nodes, you may never see this behavior.
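To make the preferred affinity concrete, the Kubernetes pattern looks roughly like the following. This is an illustrative sketch of preferred node affinity for the CAS workload label, not an extract from the SAS deployment assets.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: workload.sas.com/class
          operator: In
          values:
          - cas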
Finally, I should state that Kubernetes doesn’t have the concept of a node pool, just nodes, a node pool is a construct developed by the Cloud providers, in their implementation of Kubernetes. This is how they provide elasticity for the Kubernetes node infrastructure.
Simple! Maybe we should look at some more examples to explain what is happening.
Using CAS auto-resources
At this point you might think: I know how to fix this, I just need to use the CAS auto-resources transformer. The CAS auto-resources transformer automatically adjusts the resource limits and requests for the CAS pods (controller and workers), moving them from a ‘Burstable’ Quality of Service (QoS) to a ‘Guaranteed’ QoS, with values of approximately 86 percent of the available resources (memory and CPU) of the first node found with the “CAS” label.
This might be a simplification of what the CAS Operator is doing but is the “out-of-the-box” node affinity behavior.
Enabling the auto-resources does two things for us: firstly, it ensures that there is only one CAS pod per node, and secondly, it adjusts the resources for the CAS pods without you (the SAS administrator) having to calculate and set a value. The resources are set based on the size of the nodes.
If you are familiar with the SAS Viya 3.x deployment, using the CAS auto-resources allows you to have the same topology as using the CAS host group with SAS Viya 3.x. Using the CAS auto-resources (along with the CAS workload taint) allows you to have nodes dedicated to running the CAS Server.
So, what happened with this configuration?
Figure 2. Preferred node affinity with CAS auto-resources
In Figure 2, you can see that now the CAS Controller and Worker pods are all running on separate nodes, but still in the generic node pool. I should also state that there may have been other pods running on those nodes along with the CAS pods. I didn’t check, but Kubernetes could still schedule other pods to those nodes, depending on their resource requests. Remember, the generic nodes did not have any taint applied.
So, better, but not perfect.
Using Required nodeAffinity
The deployment assets provide an overlay to change the CAS node affinity from preferred scheduling to use required scheduling (requiredDuringSchedulingIgnoredDuringExecution) for the Node Affinity. The overlay is called require-cas-label.yaml. It is located under the sas-bases folder: sas-bases/overlays/cas-server/require-cas-label.yaml
Using this overlay means that the CAS pods will only run on nodes that have the ‘workload.sas.com/class=cas’ label. Therefore, you need to ensure that there are sufficient nodes available to run all the CAS pods. Otherwise, some of the CAS pods will not be able to run.
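Referencing the overlay in the kustomization.yaml might look like the following abridged snippet. Only the require-cas-label.yaml path is taken from above; the other entries are placeholders for whatever transformers your deployment already includes.
transformers:
# ... existing transformers, including the CAS auto-resources entries ...
- sas-bases/overlays/cas-server/require-cas-label.yaml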
At this point my kustomization.yaml has the definition for using the CAS auto-resources and it also includes the require-cas-label.yaml overlay. Figure 3 shows the results of using the two transformers.
Figure 3. Required node affinity with CAS auto-resources
As you can now see, the CAS Controller and Worker pods are all now running on the cas nodes, with one CAS pod per node. This is what I wanted, the cas node pool is now being used. 😊
Just to round out the discussion, if you wondered what the deployment would look like if I used the required node affinity without enabling the CAS auto-resources, this is shown in Figure 4.
Figure 4. Required node affinity without CAS auto-resources
As can be seen, the CAS Controller and Worker pods are now all running on a single cas node (aks-cas-xxxxxx-vms000001).
Conclusion
Coming back to my question “What went wrong with my CAS workload placement?”
The short answer was nothing, Kubernetes did exactly what it was told to do!
The pod scheduling rules in Kubernetes are complex and many different conditions can affect where a pod is started, what node will be used. In this post, we have discussed node affinity and taints, but there is also node anti-affinity and pod anti-affinity that will affect where a pod runs.
Using the CAS auto-resources transformer enables you to set the CAS pod resources based on the size of the nodes being used, and it configures the pods to run with a Guaranteed QoS. I would expect that most, if not all, production deployments will use the CAS auto-resources configuration, unless the environment makes little use of the CAS Server.
Remember, running the CAS pods on the same node defeats the benefits of using an MPP CAS server, namely fault tolerance, scalability, and performance. Therefore, I would always recommend using the CAS auto-resources.
However, there is one possible scenario for not using the auto-resources: doing so requires the CASDeployment Operator to have a ClusterRole with "get" access on the Kubernetes nodes. This role gives the Operator the ability to inspect the nodes and see their resources.
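For context, the kind of cluster-scoped access involved looks roughly like the sketch below. The role name is hypothetical and the actual definition ships with the deployment assets; this is only to illustrate the scope of access that some IT standards push back on.

# Illustrative only: the style of ClusterRole the CASDeployment Operator needs for
# auto-resources. The name is hypothetical; the real role is provided with the
# deployment assets.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cas-node-reader   # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]   # read access to node capacity (CPU and memory)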
If your organization's IT (Kubernetes) standards do not allow this role assignment, so that it is not possible to grant the ClusterRole access, then you should do the following:
Manually calculate the resources needed and use the 'cas-manage-cpu-and-memory.yaml' transformer to set them (a hedged sketch follows after this list), and
Enable required node affinity with the ‘require-cas-label.yaml’ transformer.
See the deployment documentation, Configure CAS Settings - Adjust RAM and CPU Resources for CAS Servers, for the full procedure. But that's a story for another article.
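For illustration only, below is a hedged sketch of that style of transformer. The patch path, target group, and resource values are assumptions based on the general CASDeployment structure; treat the example file shipped in your sas-bases as the authoritative template and adjust the values to your node size.

# Illustrative sketch only: manually set the CAS resources when auto-resources
# cannot be used. The path, target group and values are assumptions; start from
# the example file in your sas-bases.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: cas-manage-cpu-and-memory
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/containers/0/resources
    value:
      requests:
        memory: 96Gi
        cpu: "14"
      limits:
        memory: 96Gi
        cpu: "14"
target:
  group: viya.sas.com
  kind: CASDeployment
  name: .*

Setting the requests equal to the limits keeps the CAS pods in the Guaranteed QoS class, mirroring what the auto-resources transformer would have done.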
To summarize the key takeaways…
When using preferred scheduling, the CAS pods may end up on other nodes when the preferred (cas) nodes are not available.
Using CAS auto-resources with preferred scheduling does NOT guarantee that the cas nodes will be used. However, they will be used if available.
Use the require-cas-label.yaml transformer to implement required (strict) node affinity, especially if there are un-tainted nodes in the cluster.
This forces the CAS pods to use only the CAS node pool nodes, that is, the nodes with the CAS workload class label (workload.sas.com/class=cas). It will also trigger the cas node pool to scale, if possible.
But you need to ensure that there are sufficient nodes available to run all the CAS pods. Otherwise, some pods may end up in a pending state.
Finally, you might have noticed that each screenshot shows a different set of node names. This is because between each test I deleted the SAS Viya namespace and waited for the AKS cluster to scale down to only a system node. In AKS, it appears that when a node is stopped its name is treated as used, so a new node is started with the next name in the sequence. Hence, you can see that I ran the test in Figure 4 before the test in Figure 3. I hope this is useful and thanks for reading.
References
The SAS Viya Infrastructure as Code (IaC) projects are available for AWS, Google Cloud Platform (GCP), and Microsoft Azure.
SAS Viya 4 Infrastructure as Code (IaC) for Amazon Web Services (AWS)
SAS Viya 4 Infrastructure as Code (IaC) for Google Cloud Platform (GCP)
SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure
Find more articles from SAS Global Enablement and Learning here.
03-17-2022
02:11 PM
In this article I would like to discuss simplified deployment patterns and share some videos that I have previously published. In two previous articles I discussed SAS Viya deployment topologies: see Creating custom SAS Viya topologies – realizing the workload placement plan and Creating custom SAS Viya topologies – Part 2 (using custom node pools for the compute pods).
In this post I want to discuss two alternatives to using the default approach which employs five node pools: cas, stateless, stateful, compute and connect.
So, will a simplified deployment topology lower or reduce the infrastructure costs?
I could drive you all crazy by just saying “IT DEPENDS”! But I think this warrants a deeper look.
There are many non-functional requirements that can have an impact on the infrastructure requirements, for example performance, availability, and security. Of these, the performance and availability requirements are architecturally significant: they can have a significant impact on the infrastructure costs. So, it is important that we understand the requirements.
Understand the requirements
There is an adage in computing that the last 'nine' of availability you implement will be the most expensive IT spend. Therefore, understanding your organization's Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) is key to defining the right approach, getting the right deployment design, and avoiding unnecessary infrastructure costs.
When we think about performance, it is important to understand that not every organization or business process needs blinding performance, a "super car" (a McLaren or Ferrari). The reality is that organizations are after "acceptable performance within the constraints", and the biggest constraint is usually the budget (cost, $$$).
So, it is critical that you understand the availability and performance requirements to define the right deployment topology. I'm sorry if I'm preaching to the converted! 🙂
Comparing SAS Viya 3.x and SAS Viya 4 deployments
Before we get into the details of the SAS Viya topologies, let’s take a moment to set a baseline for the discussion. Below is a simple analogy between SAS Viya 3.x nodes (servers) and Kubernetes node pools.
A node pool can be compared to a single machine (physical or virtual) or multi-machine host group in a Bare OS SAS Viya 3.x deployment.
In a SAS Viya 3.x Bare OS deployment you could use one machine for the SAS Cloud Analytic Services (CAS) server and one machine for the rest of the SAS Viya services: a 2-machine (server) deployment. But larger deployments would have multiple servers. For example, you might have 5 machines for MPP CAS, 2 machines for the programming run-time, and 3 machines to provide high availability for the infrastructure servers and microservices, giving a total of 10 machines.
It is similar with the node pools; you can have just 2 node pools (CAS and general) or use the 5 node pool option to provide different node types for each type of SAS Viya workload.
However, the key difference between a node pool and the machines in a Viya 3.x deployment is that a node pool is a scalable template for VM instances. You define a node template (instance type, label, taint, with or without GPU, and so on) and then make it scalable by defining a minimum and maximum number of nodes (VM instances) in the node pool.
So, even if you start with a 2 node pool topology, it can still be scaled in terms of the number of nodes (if needed). But remember, all nodes within a node pool have the same specification and attributes (instance type, storage mounts, and Kubernetes labels and taints).
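As a concrete illustration, every node created from a 'cas' node pool would carry the same workload class label and taint. In the sketch below the label and taint follow the standard workload.sas.com/class convention, while the node name and the rest of the object are hypothetical.

# Illustrative only: the label and taint that every node in a 'cas' node pool carries.
# The node name is hypothetical.
apiVersion: v1
kind: Node
metadata:
  name: aks-cas-12345678-vmss000000
  labels:
    workload.sas.com/class: cas
spec:
  taints:
    - key: workload.sas.com/class
      value: cas
      effect: NoSchedule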
Using simplified topologies
Coming back to the question, will a simplified deployment topology lower or reduce the infrastructure costs?
It can, but you can't just think about the node pools in isolation, as one of the key benefits of using multiple node pools is that the compute instance types can be optimized for the processing needs and workload. So, the question could be rephrased as
“How many node pools should I have”?
But the question could also be “How many nodes do I need in the node pool”?
In the video I discuss two deployment patterns:
Using two node pools for the SAS Viya components (pods), and
Using three node pools for the SAS Viya components (pods).
In both patterns there is still the default or system node pool for non-Viya services, for example the ingress controller, cert-manager, or the monitoring and logging components. The examples show the use of an SMP CAS Server, but this could also be an MPP CAS Server.
Pattern 1: Using two node pools
This deployment pattern has a ‘General’ node pool, which is for everything other than CAS, and a CAS node pool.
This pattern would be a good choice for smaller environments or environments with a small programming (SAS Compute Server) workload, or where there isn’t the need to dedicate node(s) to the programming workload. This is shown in the image below.
Pattern 2: Using three node pools
The three-node pool deployment pattern provides dedicated resources for both the CAS and programming workloads. The three node pools are:
Services node pool – for the Stateless and Stateful services (application pods)
Compute node pool, and
CAS node pool.
This is shown in the image below.
The Video…
As this has ended up being longer than I had intended, I guess it’s about time I let you watch the video.
Conclusion
In this article and the video I have focused on node pools and nodes, but there are many things that can affect the infrastructure requirements, and hence the costs, such as the availability and performance requirements.
When running in the Cloud there are many storage options and the choice of storage can also have a significant impact on the infrastructure costs.
While the intent of this is to raise awareness of the many different deployment options and considerations, I have probably only scratched the surface of this topic, and you may have many more questions.
Finally, the SAS Viya 4 Infrastructure as Code (IaC) projects are available on GitHub, see the links below:
SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure
SAS Viya 4 Infrastructure as Code (IaC) for Amazon Web Services (AWS)
SAS Viya 4 Infrastructure as Code (IaC) for Google Cloud Platform (GCP)
Below are the other videos in this series:
SAS Viya Topologies – An Introduction (Part 1 of 4)
SAS Viya Topologies – Basic Topologies (Part 2 of 4)
SAS Viya Topologies – Topologies 2 (Part 3 of 4)
I hope this is useful and thanks for reading.
Find more articles from SAS Global Enablement and Learning here.
03-14-2022
05:54 PM
Hi Alan, good questions. You are right that SAS/CONNECT is used for sessions from SAS 9 and other SAS Viya environments. If you only have a single SAS Viya environment and there is no requirement for sessions from other SAS environments, then yes, having a connect node pool is not needed.
The topology shown is for a "fully" scaled-out deployment, or what I call "separation by tier". In fact, unless you have a lot of SAS/CONNECT sessions using the spawner, I would just have a 'Compute' node pool to support the SAS/CONNECT and Compute (sas-programming-environment) pods.
For your environment, a three node pool topology is probably fine. That is, a shared node pool for the stateless and stateful pods, a shared node pool for Connect and Compute pods, and a CAS node pool.
To implement this you need to create a patch transformer to update the 'sas-connect-spawner' pod to use the Compute node pool.
Below is an example (that uses strict scheduling).
I hope that helps.
# This transformer changes the sas-connect-spawner pod to run on the compute nodes
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: add-compute-label
patch: |-
  - op: remove
    path: /spec/template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
    value:
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: In
              values:
                - connect
          matchFields: []
        weight: 100
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: NotIn
              values:
                - compute
                - stateless
                - stateful
          matchFields: []
        weight: 50
  - op: add
    path: /spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: workload.sas.com/class
      operator: In
      values:
        - compute
  - op: replace
    path: /spec/template/spec/tolerations
    value:
      - effect: NoSchedule
        key: workload.sas.com/class
        operator: Equal
        value: compute
target:
  kind: Deployment
  name: sas-connect-spawner
03-13-2022
01:27 PM
In my last post, I described how to realize your SAS Viya workload placement plan (see here). In that article I discussed creating node pools to dedicate nodes to running SAS Micro Analytic Service (MAS) pods and the CAS Servers when running multiple SAS Viya environments (namespaces) in a shared Kubernetes cluster.
Both prior to that article and more recently, I have been asked about dedicating nodes to the Compute Server or, more correctly put, the 'sas-programming-environment' pods. In this blog I will share the required configuration changes and my thoughts on creating custom node pools to support the compute workloads.
First, we will look at the new workload placement plan, the target topology.
In this example, we will once again look at running two SAS Viya environments (production and discovery). As per last time, the stateless, stateful, connect and realtime nodes are shared by both SAS Viya environments, and the CAS Servers are running on dedicated, or separate, node pools for each environment.
But now we will add a new node pool for the programming workloads for the discovery environment. The ‘compute’ node pool is dedicated to the production environment and the ‘comp2’ node pool is dedicated to the discovery environment. Figure 1 illustrates my new workload placement plan.
Figure 1. Target topology running two SAS Viya environments.
A key driver for using this configuration would be the need to use different instance types (remember, instance types vary by the type and number of CPUs, RAM, local disk, and so on) for the SAS Viya environments. For example, perhaps the production workload is more controlled and better understood in terms of its resource demand profile; it is predictable in terms of the CPUs/cores and RAM required to complete the workloads within a given SLA. The discovery workload is more variable and needs larger nodes in terms of CPUs/cores and RAM to support the variability of the processing.
You are focusing on the resource (capacity) requirements for each workload and cost optimization.
Another driver for this topology might be the need to totally separate (isolate) the production and discovery processing, so that a "rogue" discovery job can't impact any production processing. The workload separation may still be needed even when using the new workload orchestration features (SAS Workload Management), as the orchestration works at a namespace (SAS Viya environment) level, not across multiple namespaces.
SAS Workload Management for SAS Viya was GA in November 2021, with Stable 2021.2.1 and Long-Term Support 2021.2.
Creating the cluster
In my last blog, I discussed creating a naming scheme and the recommendation not to over taint the nodes. For my testing I used the following labels and taints.
Node pool cas: Labels workload.sas.com/class=cas, environment/prod=cas; Taint workload.sas.com/class=cas
Node pool casnonprod: Labels workload.sas.com/class=cas, environment/discovery=cas; Taint workload.sas.com/class=cas
Node pool realtime: Labels workload/class=realtime; Taint workload/class=realtime
Node pool compute: Labels workload.sas.com/class=compute, environment/prod=compute; Taint workload.sas.com/class=compute
Node pool comp2: Labels workload.sas.com/class=compute, environment/discovery=compute; Taint workload.sas.com/class=compute
As can be seen from the table above, I have only used the standard SAS taints for the CAS and compute nodes. Again, I did my testing in Azure and used the SAS Viya 4 Infrastructure as Code (IaC) for Microsoft Azure GitHub project to create the cluster.
To confirm the configuration of the nodes, that is, the labels that have been assigned, I used the following command to list the node labels:
kubectl get nodes -L workload.sas.com/class,workload/mas,environment/prod,environment/discovery
This gave the following output (Figure 2).
Figure 2. Displaying node labels.
To confirm the taints that have been applied use the following command:
kubectl get node -o=custom-columns=NODE:.metadata.name,TAINTS:.spec.taints
This gave the following output for my AKS cluster.
Figure 3. Displaying node taints.
Updating the SAS Viya Configuration
In my last blog I discussed preferred scheduling versus strict scheduling, and the ability to drive pods to a node by using node label(s). In the following examples I have used the 'requiredDuringSchedulingIgnoredDuringExecution' node affinity definition, which specifies rules that must be met for a pod to be scheduled onto a node.
As I haven't added any environment taint to the compute nodes, both SAS Viya deployments must be updated to stop "pod drift" across the two node pools. If the default configuration were used, the compute pods could make use of both node pools. This wasn't my desired state, so I updated the SAS Viya configuration for both environments.
In Kubernetes the word 'drift' is used in several contexts. For example, "configuration drift" refers to an environment in which the running cluster becomes increasingly different from its intended state over time, usually due to manual changes and updates to the cluster. The term can also describe "container drift", usually within a security context, which refers to detecting and preventing misconfiguration in Kubernetes deployments.
In this context, “pod drift” is referring to pods that end up running on nodes that are not the target or desired location. A drift away from the target topology.
Controlling the use of the compute nodes is more complex than the CAS or MAS configuration. This is because the ‘sas-programming-environment’ has several components. If you look at the site.yaml you will see that the following configuration needs to be updated:
sas-compute-job-config
sas-batch-pod-template
sas-launcher-job-config
sas-connect-pod-template
I will not go into the details here, but the different ‘sas-programming-environment’ components are explained in the SAS Viya Administration documentation and this SAS Communities blog.
In the following examples, the patch transformers will make the following changes:
Remove the preferred scheduling to simplify the manifest, and
Add the definition in the required scheduling section for the node selection.
The discovery configuration is shown in the examples below. In all cases I have tested for the value of the environment label, but I could have just tested for the existence of the label, in this case ‘environment/discovery’.
This would look like the following:
- op: add
  path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
  value:
    key: environment/discovery
    operator: Exists
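The production deployment needs the mirror-image change, testing the environment/prod label instead. For example, using the value test (as in the transformers that follow) and the labels from the table above, the production entry would be:

- op: add
  path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
  value:
    key: environment/prod
    operator: In
    values:
      - compute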
Once you have created the patch transformers shown here, you need to update the kustomization.yaml to refer to the new configuration.
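For example, if the four patch transformers were saved under site-config with the (hypothetical) file names below, the transformers section of the kustomization.yaml would gain entries like these:

# kustomization.yaml fragment - the file names are illustrative
transformers:
  # ... existing entries ...
  - site-config/set-compute-job-label.yaml
  - site-config/set-batch-compute-label.yaml
  - site-config/set-launcher-job-label.yaml
  - site-config/set-connect-template-label.yaml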
Compute Server (sas-compute-job-config) configuration
The following example is a patch transformer to update the sas-compute-job-config PodTemplate.
# Patch to update the sas-compute-job-config pod configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-compute-job-label
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
    value:
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: In
              values:
                - compute
          matchFields: []
        weight: 100
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: NotIn
              values:
                - cas
                - connect
                - stateless
                - stateful
          matchFields: []
        weight: 50
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: environment/discovery
      operator: In
      values:
        - compute
target:
  kind: PodTemplate
  version: v1
  name: sas-compute-job-config
To view the changes made I used ‘icdiff’ to compare the default configuration (site.yaml) and the new configuration that was produced (compute-job-site.yaml). This is shown in Figure 4.
Figure 4. Review the compute server change.
As can be seen, the preferred scheduling section has been removed (shown in red) and the new entry for the required scheduling is shown in green.
SAS Batch Job (sas-batch-pod-template) configuration
The following example is a patch transformer to update the sas-batch-pod-template PodTemplate.
# Patch to update the sas-batch-pod-template configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-batch-compute-label
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
    value:
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: In
              values:
                - compute
          matchFields: []
        weight: 100
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: NotIn
              values:
                - cas
                - connect
                - stateless
                - stateful
          matchFields: []
        weight: 50
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: environment/discovery
      operator: In
      values:
        - compute
target:
  kind: PodTemplate
  version: v1
  name: sas-batch-pod-template
Once again, to view the changes made I used ‘icdiff’ to compare the default configuration (site.yaml) and the new configuration that was produced (batch-site.yaml). This is shown in Figure 5.
Figure 5. Review the batch job change.
Again, you can see the deletion in red and the additional configuration in green.
SAS Launcher Job (sas-launcher-job-config) configuration
The following example is a patch transformer to update the sas-launcher-job-config PodTemplate.
# Patch to update the sas-launcher-job-config pod configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-launcher-job-label
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
    value:
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: In
              values:
                - compute
          matchFields: []
        weight: 100
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: NotIn
              values:
                - cas
                - connect
                - stateless
                - stateful
          matchFields: []
        weight: 50
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: environment/discovery
      operator: In
      values:
        - compute
target:
  kind: PodTemplate
  version: v1
  name: sas-launcher-job-config
Connect Server (sas-connect-pod-template) configuration
The following example is a patch transformer to update the sas-connect-pod-template PodTemplate.
# Patch to update the sas-connect-pod-template pod configuration
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: set-connect-template-label
patch: |-
  - op: remove
    path: /template/spec/affinity/nodeAffinity/preferredDuringSchedulingIgnoredDuringExecution
    value:
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: In
              values:
                - compute
          matchFields: []
        weight: 100
      - preference:
          matchExpressions:
            - key: workload.sas.com/class
              operator: NotIn
              values:
                - cas
                - connect
                - stateless
                - stateful
          matchFields: []
        weight: 50
  - op: add
    path: /template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-
    value:
      key: environment/discovery
      operator: In
      values:
        - compute
target:
  kind: PodTemplate
  version: v1
  name: sas-connect-pod-template
Verifying the configuration
After both environments were running, I started four SAS Studio sessions and then used Lens to confirm that the compute server pods were running on the correct nodes. This is illustrated in the figure 6.
Figure 6. Verifying the configuration.
If you look closely you will see there are three SAS Studio sessions in the discovery environment (namespace). This is shown by the three pods running on the 'aks-comp2-3018…' node, while there is one production SAS Studio session running on the 'aks-compute-301…' node. (Remember that SAS Compute Servers run as "sas-launcher-" pods, and here we're looking at those Controlled By "Job", not "ReplicaSet".)
Conclusion
Here we have looked at some of the drivers for using separate node pools for the compute pods and seen how to implement this (with the ‘compute’ and ‘comp2’ node pools) for two SAS Viya environments.
The examples shown above rely on updating both SAS Viya deployments, as I haven’t created a custom taint for the new ‘comp2’ node pool. If you wanted to keep the production deployment as “vanilla” as possible the minimum approach would be to add an environment taint to the new compute (comp2) node pool for the discovery deployment.
But remember if you use preferred scheduling you could end up with “pod drift” across all the available compute node pools unless you add additional taints to keep unwanted pods away.
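To illustrate that taint-based approach, here is a hedged sketch. It assumes a hypothetical environment/discovery=compute:NoSchedule taint on the comp2 node pool and appends the matching toleration to one of the discovery pod templates; the same pattern would be repeated for the other sas-programming-environment templates, and it assumes the template already has a tolerations list to append to.

# Illustrative only: tolerate a hypothetical environment/discovery=compute:NoSchedule
# taint applied to the comp2 node pool. Repeat for the other pod templates.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: add-discovery-compute-toleration
patch: |-
  - op: add
    path: /template/spec/tolerations/-
    value:
      key: environment/discovery
      operator: Equal
      value: compute
      effect: NoSchedule
target:
  kind: PodTemplate
  version: v1
  name: sas-compute-job-config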
Using the four patch transformers it would be possible to optimize, or tailor, the deployment further to meet the customer’s specific needs, allowing the different pod types to use specific node pools (node types).
Finally, if you were to share a single compute node pool across multiple SAS Viya environments, the node pool must be sized appropriately to support the workload of all the SAS Viya environments.
This doesn't just mean selecting the right instance type (node size); you should also focus on elements such as the number of nodes in the node pool (min and max values) and setting the "max_pods" value for the nodes. Setting 'max_pods' can help stop the nodes from getting overloaded, but it may mean you incur higher costs for running the Kubernetes cluster.
This may need some tuning once you understand the workloads and system performance.
I hope this is useful and thanks for reading.