
A handful of suggestions for enhancing the end user experience for SAS Viya 2021.2 and later


In this short post we'll quickly identify some levers and considerations for bolstering the Viya end-user experience. It's important to note that this is not an exhaustive list, and that SAS Viya itself (server and client side) is constantly evolving: what is true and accurate today may not be so a few months down the line. The bottom line is to consult the latest documentation, posts and technical support notes on a regular basis.

 

Think about network entry and exit points for users connecting to Viya in a public Cloud

 

For some customer organizations, the user community is dispersed across a wide geographical area relative to the region where Viya has been deployed, and that distance can be a factor in the overall user experience.

 

With that in mind, the customer organization's IT teams need to consider how best to route traffic for groups of users. Cloud-specific services can help here. Offerings like Azure's "Front Door" work at Layer 7 (HTTP/HTTPS) to route user traffic as quickly as possible, which helps keep latency between the end users and the Viya environment to a minimum. Similar features are also offered by AWS and GCP.

 

As an aside, for a comparison of services from Azure, AWS and GCP, see these useful resources from Azure: Azure & AWS or Azure & GCP.

 

Getting the Viya applications to load more quickly and be environment-specific on the user's desktop or laptop

 

If you've not heard of Progressive Web Apps before, then check them out. From SAS's documentation:

 

You can install SAS Visual Analytics as a Progressive Web App (PWA). The benefits of using SAS Visual Analytics as a PWA include the following:

 

  • Application persistence – By default, your session will never time out, so you can restart your work more quickly. Your administrator can control the timing with a configuration property in SAS Environment Manager.
  • Performance – When installed as a PWA, SAS Visual Analytics initializes faster than when accessed in the browser.
  • Desktop experience – Installing SAS Visual Analytics as a PWA confers all the benefits of a traditional installation. You can launch SAS Visual Analytics from the Start menu or Taskbar, and you do not need to sort through countless tabs to find the correct instance of SAS Visual Analytics.
  • Renaming – You can rename each PWA instance to quickly access different environments, such as development, test, or production servers.

 

Thinking about the Viya server(s) configuration and infrastructure

 

As with all multi-user and analytical software, the end-user experience is heavily influenced by the configuration of the underlying software and, of course, the hardware infrastructure. It should be added that the configuration of third-party software may also be a key driver.

 

A good starting point, one that will inevitably change over time, is the Viya 4 Tuning Guide.

 

From the Tuning Guide you will see references to topics such as:

 

  • How to set upper limits on the number of Compute sessions each user (standard user and super user) can run concurrently. See SAS_LAUNCHER_USER_PROCESS_LIMIT and SAS_LAUNCHER_SUPER_USER_PROCESS_LIMIT.
  • SAS Compute pod settings aligned to specific compute contexts, allowing different pod limits and requests for CPU and RAM. See here.
  • Tuning the Open Distro for Elasticsearch component
  • Tuning the CAS Server
  • Tuning the SAS Infrastructure Data Server (PostgreSQL)
  • Tuning the JDBC connections
  • Tuning the LDAP connections

 

To complement the Viya Tuning Guide, there are the SAS Micro Analytic Service tuning guidelines.

 

Compute Server Pods: "Pre-Pull" and "Pool of Available Servers"

 

When users of SAS Studio and Model Studio start a session, a Compute Server pod is made available to them. With this in mind, it is useful to know what can be done to speed up the availability of those pods. At present two concepts can help, namely "pre-pulling container images" and a "pool of available Compute Servers".

 

"Pre-Pull" Compute Server images

 

The container image for the Compute Server is measured in gigabytes. As such, it is sufficiently large that moving it from the container repository to a node that will run Compute Server pods could take several minutes. To mitigate the impact of that movement, the idea is to "pre-pull" the Compute pod image (a capability known as "SAS Image Staging") during the SAS Viya deployment, so that it is already available on the nodes by the end of the deployment. For more details read this @RPoumarede post and the official documentation.

 

Pool of Available Servers

 

Now for the second concept, the "Pool of Available Servers". From the official documentation: SAS Compute servers that are configured to be reusable can also be configured to have a minimum number of servers that are available. This configuration enables the SAS Compute service to maintain a set of running compute servers that can be reused. Whenever an environment is started, the number of servers that are specified are also started.

 

In short, where the "Pool of Available Servers" configuration has been set, users of SAS Studio and users of Model Studio running two or more pipelines in parallel are likely to experience benefits. For SAS Studio users, their compute context/session is likely to start more quickly. Model Studio users are likely to see their pipelines run in parallel as soon as they submit the flows.

 

Compute Server Pods: CPU and RAM limits

 

So far we have spoken about concepts that help users get to the point of using their analytical clients in a more timely fashion. What can we consider to improve the performance of jobs being submitted to the SAS Compute Server?

 

For customer organizations that will be using SAS Studio based sessions and batch jobs, Viya users and administrators are encouraged to use different compute contexts for different types of workloads. For example, when a SAS Studio program or flow contains processing that would benefit from multiple threads (e.g. Proc Sort, Proc Summary, Proc SQL, Proc GLM), use a compute context that has a larger number of cores than the default Compute pod settings. As a reminder, options such as MEMSIZE, SORTSIZE and SUMSIZE are also important when running such code (a short sketch follows below). A simple approach would be to have small, medium, large and X-large aligned compute contexts. These could, for example, have pod limits of 1 CPU + 1 GB RAM, 1 CPU + 2 GB RAM, 4 CPUs + 4 GB RAM and 8 CPUs + 8 GB RAM.
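
As a rough, hedged illustration of the point above, the SAS sketch below sets a few session options and runs a couple of threaded procedures. The option values, library, table and variable names are assumptions for illustration only; MEMSIZE is a start-up option, so in Viya it is normally set through the compute context rather than inside the program.

```sas
/* Illustrative only: option values and table/variable names are assumptions. */
/* SORTSIZE and SUMSIZE can be changed within a session; MEMSIZE is a         */
/* start-up option, so set it in the compute context instead.                 */
options fullstimer threads cpucount=actual sortsize=2G sumsize=2G;

/* PROC SORT and PROC SUMMARY are examples of threaded procedures that can    */
/* benefit from a compute context with more CPU and RAM.                      */
proc sort data=work.transactions out=work.transactions_sorted;
   by customer_id txn_date;
run;

proc summary data=work.transactions_sorted nway;
   class customer_id;
   var amount;
   output out=work.customer_totals(drop=_type_ _freq_) sum=total_amount;
run;
```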

 

As an aside, remember to take into account how Kubernetes pod CPUs map to the infrastructure. One CPU, in Kubernetes, is equivalent to:

  • 1 AWS vCPU
  • 1 GCP Core
  • 1 Azure vCore
  • 1 Hyperthread on a bare-metal Intel processor with Hyperthreading

 

Typically the resource request values would be lower than the limit values for these pods in this scenario. Some informal testing has shown that there are performance gains to be realized for a given set of workloads submitted to the Compute Server. When thinking about increasing CPU and RAM limits for Compute pods, you will need to be mindful of the likelihood of diminishing returns in terms of performance gains. In part, this is down to the I/O throughput constraints of the attached storage being read from and written to. Previously published research by @MargaretC for SAS 9.4 indicated that 4 CPUs (4 cores) and 1 GB of RAM per core was sufficient. It is worth mentioning that the use of options like UTILLOC to leverage distinct, dedicated storage locations may prove beneficial from a performance perspective (the sketch below shows a quick way to check these settings from within a session). And for cloud-based Kubernetes environments, you need to be clear about vCPUs and their alignment to physical cores and threads for a given instance type; for Azure, see this page.
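
As a small, hedged example, the step below simply prints what the current compute session is actually running with, which is a quick way to confirm whether a context's CPU, memory and storage settings have taken effect. The specific options listed are just examples.

```sas
/* Print the values the current compute session is running with.             */
/* UTILLOC and MEMSIZE are start-up options, so they are set in the compute  */
/* context or launcher configuration rather than with an OPTIONS statement.  */
proc options option=(memsize sortsize sumsize cpucount utilloc work) value;
run;
```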

 

Since late 2021, customers can also license SAS Workload Management to enhance the ability to manage and prioritize workloads. Indeed, the ability to queue jobs based on a number of parameters can give users confidence that their jobs will be executed even if the system is particularly busy. For more information see the SAS Workload Management documentation and recent posts from GEL folks.

 

Visual Analytics usage

 

As with other SAS usage, it pays to create content that is optimized for a given visual or programming interface. This short article does not go into such practices, but other resources on that topic are available. For this short post we call out one specific feature, namely SAS report packages.

 

For reports that are based on data that change on a relatively slow cadence, e.g. every 8 to 24 hours, the use of SAS report packages may be appropriate. From a report consumer's perspective, the snappiness of such report packages may provide a better end-user experience because there is little to no work being done by Viya services or a CAS server. From the documentation:

 

You can export a report and a snapshot of your data and query results for a report from SAS Visual Analytics to a SAS report package. You do not have to export an entire report. Instead, you can select individual objects, containers, and pages that you want to export. You can then use the SAS Visual Analytics SDK to embed the report information into your custom web pages and portals. For more information about the SDK, see SAS Visual Analytics SDK for developers.

 

The report package is a compressed file named report-package-name.sasreportpkg.zip that includes a snapshot of the report, images, style sheets, and data that are needed for a given set of report objects. The package also includes all of the data that is needed to render the selected objects, as well as the data that is required to support any actions that are defined between the objects.

 

Visual Data Mining and Machine Learning usage

 

For data mining and machine learning focused users of SAS Model Studio, the user experience has continued to be enhanced over the lifetime of SAS Viya 4. Knowing how Model Studio works for given use cases or scenarios can help both individual users and all users working concurrently on the system.

 

The first use case is well known and documented in publications and posts. In short, it relates to users leveraging 'training', 'validation' and 'test' data sets in their analysis. For such use cases, the general recommendation is to have variables that can be mapped to the roles of Partition (e.g. _PartInd_) and Key (e.g. _dmIndex_) already present in the data table before it is loaded into CAS for use with Model Studio (a sketch of preparing such columns follows below). If this is not the case, then the table will be rewritten to the CASDATADIR location on the CAS Server Controller. Writing such tables to the CAS Server Controller can negatively impact all users of that CAS server if the table(s) in question are 'large'. In this context, 'large' can mean tables that use more than 40% of the disk space available to the CASDATADIR location on the CAS Server Controller. While the table is being written to this location, the CAS Server Controller and Workers can become busy, as can the network which connects them.
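
As a hedged sketch of that recommendation, the step below adds a key column and a partition indicator to a table before loading it into CAS. The input table, caslib, split percentages, and the mapping of _PartInd_ values to training/validation/test are all assumptions for illustration; the actual value-to-role mapping is declared when you assign the Partition role in Model Studio.

```sas
/* Hypothetical example: add Key and Partition columns before loading to CAS, */
/* so Model Studio does not have to rewrite the table to create them.         */
data work.abt_prepped;
   set work.abt_raw;                           /* assumed input table         */
   if _n_ = 1 then call streaminit(2022);      /* reproducible random split   */
   _dmIndex_ = _n_;                            /* unique key for each row     */
   _rnd_ = rand('uniform');
   if      _rnd_ < 0.60 then _PartInd_ = 1;    /* e.g. training               */
   else if _rnd_ < 0.90 then _PartInd_ = 0;    /* e.g. validation             */
   else                      _PartInd_ = 2;    /* e.g. test                   */
   drop _rnd_;
run;

/* Load the prepared table into CAS; the caslib name is an assumption. */
cas;                                           /* start a default CAS session */
proc casutil;
   load data=work.abt_prepped outcaslib="public" casout="abt_prepped" promote;
quit;
```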

 

See this SAS documentation section and this post for more details.

 

The second use case is running parallel pipelines in a Model Studio project. As with the SAS Visual Forecasting notes below, there is a setting that allows Viya administrators to increase the Maximum Concurrent Nodes (the setting name is sas.analytics.flows.maximumConcurrentNodeExecution) from its default value of 5. See the documentation. This can be helpful if users are planning on creating Model Studio projects with more than 5 parallel pipelines. Increasing this setting, along with use of the "Pool of Available Servers" and "Reusable Servers", can bring benefits, as described in this post by @RobCollum.

 

SAS Visual Forecasting usage

 

For customers who use SAS Visual Forecasting, two concepts that can help users have a good experience are listed below. Again, this is not an exhaustive list by any means, but they are certainly two that are worth mentioning. The first is “data shuffling” / “data movement”. The second is concurrent users/pipelines running modelling strategies.

 

From this section within the documentation: The actual data processing runs on a CAS server. In a distributed CAS environment, the time series are delineated and shuffled based on the distinct combination of values for the BY variables. The time series data is processed in parallel. It is written out to CAS tables on each worker node. Furthermore, threads are used on each worker node to process the time series vectors that are loaded onto a node concurrently.

 

What does this mean in practice? Imagine we have 5,000 SKUs and each SKU has 2 years of daily data on average (700+ data points per SKU). To do the forecasting, each SKU's data would be shuffled, or moved, to one of the available CAS Server Workers (let's assume we are using an MPP CAS server with 8 Workers). Once all of a SKU's data points have been placed together on one of the 8 Workers, the forecasting can begin. So, for example, imagine we want to forecast beer sales for the SKUs "Hop-tastic", "Majestic Malt" and "Cwrw Braf". All of the "Hop-tastic" data might end up on CAS Worker 2, the "Majestic Malt" data on Worker 5, and the "Cwrw Braf" data on Worker 8. For more details read these SGF papers by members of SAS's R&D teams (a, b). From this we can see that users working with large tables of data may encounter a period of waiting before the forecasting algorithms are applied. Users working with very large data can consider how some of this "data shuffling" (by-group partitioning) can be done ahead of time, or at quiet times of the day, to make best use of the available time for analysis (a sketch follows below). From the documentation: "You can use the partitioning feature in CAS to partition the table once and use that partitioning over and over again. The partitioning feature in CAS provides a more permanent and efficient solution for grouping data. The work required for partitioning does not have to be repeated as it does when using BY groups. When you partition a table using the partition action, it becomes a partitioned in-memory table that can be accessed by subsequent operations."
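
Along those lines, here is a hedged sketch of pre-partitioning a CAS table by its BY variable with the table.partition action, so the shuffle does not have to be repeated for every forecasting run. The caslib, table and variable names are assumptions for illustration, and the source table is assumed to already be loaded into CAS.

```sas
/* Hypothetical example: partition the in-memory table by SKU once, then      */
/* reuse the partitioned table in subsequent forecasting runs.                 */
cas;                                            /* start a default CAS session */

proc cas;
   table.partition /
      table  = {caslib="public", name="beer_sales", groupBy={"sku"}}
      casOut = {caslib="public", name="beer_sales_by_sku", replace=true};
quit;
```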

 

The second concept is the number of users or pipelines that can run concurrently. From the official documentation, in the section headed "Modeling Strategies Take a Long Time to Complete":

 

There are several factors that can influence the performance of modeling strategies during a pipeline run. In some circumstances, if there are a lot of users running pipelines at the same time, node execution time can be affected. By default, a maximum of five users can run the modeling strategies simultaneously. For any additional users, the modeling strategies are placed in Pending state until a run completes and another node can be started.

 

SAS Viya administrators can help their forecasting colleagues by increasing the Maximum Concurrent Nodes setting to a value greater than the default of 5. In addition to adjusting this value, the administrator may also want to consider increasing the size of the "Pool of Available Servers" and using "Reusable Servers" to facilitate quicker availability of Compute Servers.

 

SAS Studio usage

 

SAS Studio is a great environment for programmers and for low/no-code developers of data management and analytical content.

 

Running some of the steps in a flow can be optimized, and users can get predictable results by using the submission order of a flow.

 

At the current release of SAS Studio within Viya 4 (2021.2.5) it is not possible to run flows in parallel as it is with SAS 9.4 based technology, as described by @EdoardoRiva in his SGF paper. However, if users want to run other workloads in addition to their primary SAS Studio flow, they can use the 'Background Submit' feature. Here's what the SAS Studio documentation has to say:

 

You can run a saved program, query, task, or flow as a background submission, which means that the file can run while you continue to use SAS Studio. You can view the status of files that have been submitted in the background, and you can cancel files that are currently running in the background.

 

This is a great feature and one that can certainly help users be more productive. With this in mind, it is important that users and administrators are aware of what is happening in the background and how it ties back to SAS_LAUNCHER_USER_PROCESS_LIMIT. @DavidStern wrote this useful post on the topic:

 

Note: Because a background submission uses a separate compute server, any libraries or tables that are created by the submission do not appear in the Libraries section of the navigation pane in SAS Studio.

 

In practice, if SAS_LAUNCHER_USER_PROCESS_LIMIT is set to, say, 5, then a SAS Studio user could have their main session plus 4 background submissions.

 

Wrapping it up

 

Well, I was hoping this would be a short post, but if you decided to read it all, you may well have spent 10 minutes getting here. If you have read it but you're still not sure what all of this means, feel free to ask a question in the comments section. Or simply log on to a Viya environment and explore some of these proposed tips and tweaks for yourself. And, as always, if you have other ideas on this topic or think some of what I've written needs further discussion, feel free to add a comment below.

 

Thanks, Simon

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Great article!  Thank you!

Thank you for this article. If adding more compute nodes, for example for SAS Studio, is it correct that you need more cores than memory? We are reconfiguring our resources in order to use SWO. Please advise, if possible, on the dependencies between CPU and memory, or provide a link.
