SAS Workload Management provides advanced workload orchestration for the SAS Viya platform. To deliver these capabilities, it makes intensive use of many platform and infrastructure services, which exposes it to a wide range of possible external failures and instabilities. Software releases throughout 2025 have focused on making it both faster and more stable, through performance optimizations and better resilience. The result is a marked improvement in software quality and user experience.
Let’s review the enhancements across different areas.
Improvements have targeted key pain points where heavy system use could lead to noticeable delays. By focusing on the core mechanisms of job processing and orchestration, bottlenecks under high load have been reduced.
Job status updates have dropped from 75 ms to 75 μs per job (milliseconds to microseconds), a 1,000x improvement. Previously, in a test environment with 1,000 concurrent jobs, the SAS Workload Orchestrator manager service was occupied for 75 seconds when updating the job statuses, which are persisted in the SAS Infrastructure Data Server. Now the same operation takes only 75 milliseconds in total, that is, 75 μs per job. This improvement was achieved primarily by optimizing the SQL code used for the updates, by creating new indexes on some database tables, and by moving certain activities to dedicated background threads.
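The arithmetic behind these figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, and it assumes the updates are processed one after another, as the totals quoted above imply.

```python
# Figures quoted in the text, kept in integer microseconds to avoid rounding.
JOBS = 1_000
OLD_UPDATE_US = 75_000  # 75 ms per job status update (before)
NEW_UPDATE_US = 75      # 75 μs per job status update (after)


def total_update_time_us(jobs: int, per_job_us: int) -> int:
    """Total time spent updating `jobs` statuses one after another."""
    return jobs * per_job_us


old_total_s = total_update_time_us(JOBS, OLD_UPDATE_US) / 1_000_000
new_total_ms = total_update_time_us(JOBS, NEW_UPDATE_US) / 1_000
speedup = OLD_UPDATE_US // NEW_UPDATE_US

print(f"before: {old_total_s} s, after: {new_total_ms} ms, speedup: {speedup}x")
# prints "before: 75.0 s, after: 75.0 ms, speedup: 1000x"
```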
Another area that has been improved is the time that SAS Workload Management services take to start up. Previously, startup time depended on the number of jobs that were still running when the services were last stopped. Recovering the status of 100 jobs from the database could take up to 30 seconds, and during that time the service did not respond to client requests. Now, in the same test environment, SAS Workload Management services are ready in about 6 seconds, regardless of the number of jobs.
Additional optimizations include using separate threads to create pods in parallel, instead of serially, and reducing the timeout during the communication to launched pods to get resource information, so that processing is not blocked for too long in case of network issues.
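To illustrate the first of these optimizations, here is a minimal Python sketch of creating pods in parallel with a thread pool. It is not the SAS implementation: `create_pod` is a hypothetical stand-in that only sleeps to simulate the latency of a Kubernetes API call, and `max_workers` is an assumed tuning knob.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def create_pod(job_id: str) -> str:
    """Hypothetical stand-in for a pod-creation request; sleeps to mimic latency."""
    time.sleep(0.05)
    return f"pod-{job_id}"


def create_pods_parallel(job_ids, max_workers=8):
    """Create one pod per job concurrently and return {job_id: pod_name}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every request up front so that slow calls overlap.
        futures = {pool.submit(create_pod, job_id): job_id for job_id in job_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    print(create_pods_parallel(["a", "b", "c"]))
```

With a serial loop, the total time grows linearly with the number of jobs; with the pool, requests that wait on the network overlap, so the total is closer to the time of the slowest batch.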
Starting with SAS Viya stable release 2024.08, the SAS Workload Orchestrator Manager service has been re-engineered to become stateless; this includes offloading state information into the SAS Infrastructure Data Server.
This change has increased the overall utilization of the underlying database server.
Starting with SAS Viya stable release 2025.01, SAS Workload Management provides better database management, including the ability to delete old job records (only for completed jobs, never for active ones). The records can be deleted manually by using the sas-viya CLI, or automatically, based on parameters that SAS administrators can set in the SAS Workload Orchestrator configuration. By default, old job records are automatically deleted either after 60 days or when the database tables exceed 100,000 records.
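The default retention rule can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the product's logic: the record layout (`id`, `state`, `ended`), the state name `COMPLETED`, and the choice to trim the oldest completed jobs first when the record limit is exceeded are all hypothetical. Only the two default thresholds come from the text above.

```python
from datetime import datetime, timedelta

MAX_AGE_DAYS = 60      # default age threshold quoted in the text
MAX_RECORDS = 100_000  # default record-count threshold quoted in the text


def select_deletable(jobs, now, max_age_days=MAX_AGE_DAYS, max_records=MAX_RECORDS):
    """Return ids of completed jobs to delete; active jobs are never selected."""
    completed = sorted(
        (j for j in jobs if j["state"] == "COMPLETED"),
        key=lambda j: j["ended"],
    )
    cutoff = now - timedelta(days=max_age_days)

    # Rule 1: completed jobs older than the age threshold.
    doomed = [j["id"] for j in completed if j["ended"] < cutoff]

    # Rule 2 (assumed behavior): if the table is still over the limit,
    # trim the oldest remaining completed jobs.
    overflow = len(jobs) - len(doomed) - max_records
    if overflow > 0:
        survivors = [j for j in completed if j["ended"] >= cutoff]
        doomed += [j["id"] for j in survivors[:overflow]]
    return doomed
```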
This optimization reduces the records in the history table of the database.
Even when migrating from older releases with many old records stored in the database tables, software optimizations prevent a single, massive deletion operation that could block service startup for a few minutes. In those cases, SAS Workload Orchestrator submits ‘delete’ commands in batches of a few thousand records, so that the initial cleanup is spread out across multiple hours, without overwhelming the system.
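The batching idea can be sketched as follows. This example uses Python's built-in `sqlite3` module purely so that it is runnable; the real service works against PostgreSQL, and the table name `job_history`, its columns, and the `pause_s` parameter are assumptions made for illustration.

```python
import sqlite3
import time


def delete_in_batches(conn, cutoff, batch_size=1000, pause_s=0.0):
    """Delete old completed jobs in small transactions; return rows deleted."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM job_history WHERE id IN ("
            "  SELECT id FROM job_history"
            "  WHERE state = 'COMPLETED' AND ended_at < ?"
            "  LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each batch is its own short transaction
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        time.sleep(pause_s)  # spread the cleanup out over time
    return total
```

Because each batch commits separately, locks are held only briefly and other requests can be served between batches, at the cost of a longer overall cleanup.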
Manual job deletion and the parameters that control automatic deletion can both be managed with the CLI. Starting with SAS Viya 2025.09, this capability is also available on the SAS Workload Orchestrator page in SAS Environment Manager.
The SQL optimization that we have already discussed in the performance section has the additional benefit of reducing the database table fragmentation, decreasing the amount of space that SAS Workload Management tables consume.
None of these improvements removes the need to perform routine maintenance tasks for the PostgreSQL database: a healthy platform needs periodic maintenance to achieve optimum performance!
SAS Workload Orchestrator is closely integrated with various services, including event publishing through sas-arke and RabbitMQ, ongoing interactions with PostgreSQL, and frequent read/write operations to the Kubernetes API. As a result, SAS Workload Orchestrator is often among the first services affected during periods of environmental instability. Disabling SAS Workload Orchestrator may temporarily resolve certain issues (e.g., SAS Studio users can successfully connect to a backend session that otherwise could not be started), but this comes at the expense of losing the advanced functionalities that SAS Workload Management provides. In these scenarios, SAS Workload Orchestrator serves as an early indicator of broader system instability: it is impacted by these disruptions rather than causing them. The appropriate solution should be to address the underlying external service issues, rather than disabling SAS Workload Orchestrator.
To enhance its resilience, SAS Workload Orchestrator now incorporates additional checks and retry mechanisms to better manage and withstand instabilities in external services.
Examples:
SAS Workload Orchestrator has also enhanced the user experience by handling failures more robustly when it starts submitted jobs. Now, if Kubernetes or the execution host fails to start a job (a condition reported with the error code HOST_FAILED), SAS Workload Orchestrator automatically resubmits the job to a different node. Note that this is different from the automatic requeuing of restartable jobs.
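The resubmission behavior can be pictured with this short Python sketch. Only the `HOST_FAILED` code comes from the text; the `start_on_node` callable, the list of candidate nodes, and the `max_attempts` limit are hypothetical, and in the real product node selection is handled by the scheduler rather than by a simple loop.

```python
HOST_FAILED = "HOST_FAILED"


def submit_with_failover(job, nodes, start_on_node, max_attempts=3):
    """Try to start `job` on successive nodes until one does not report HOST_FAILED.

    Returns (node, status) on success, or (None, HOST_FAILED) if every
    attempted node failed to start the job.
    """
    for attempt, node in enumerate(nodes):
        if attempt >= max_attempts:
            break
        status = start_on_node(job, node)
        if status != HOST_FAILED:
            return node, status
        # The host could not start the job: move on to a different node.
    return None, HOST_FAILED
```

A quick usage example with a fake launcher in which the first node is unhealthy:

```python
def fake_start(job, node):
    return HOST_FAILED if node == "node-1" else "RUNNING"

print(submit_with_failover("job-42", ["node-1", "node-2"], fake_start))
```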
Finally, SAS Workload Orchestrator and the SAS launcher service include better retry logic to handle cases when called services return an error.
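A common way to implement this kind of retry logic is exponential backoff. The sketch below is a generic pattern, not the code used by SAS Workload Orchestrator or the SAS launcher service: the function name, the retried exception types, and the delay values are all assumptions.

```python
import time


def call_with_retry(func, attempts=5, base_delay=0.5, max_delay=8.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `func`, retrying on transient errors with exponential backoff.

    The last error is re-raised once `attempts` calls have failed.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            # Wait 0.5 s, 1 s, 2 s, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter is injectable so that the delays can be recorded or skipped in tests. Capping the delay keeps a long outage from pushing the wait time to unreasonable values.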
SAS Workload Management documentation has been enhanced, too. Improvements include:
These enhancements are in addition to existing documentation, such as the SAS Workload Management page of the troubleshooting guide: https://go.documentation.sas.com/doc/en/sasadmincdc/default/calts/p04x9ud5y3wec3n1l8dxpwvtbdl2.htm
With these latest enhancements, SAS Workload Orchestrator continues to advance reliability, efficiency, and user experience for organizations leveraging the SAS Viya platform. These improvements are designed to ensure your operations run more smoothly and with greater transparency. For more details and hands-on guidance, be sure to explore the updated documentation linked above.
Find more articles from SAS Global Enablement and Learning here.