SAS Grid Manager and SAS Viya: Comparing Capabilities

In the previous article of this series, we looked into SAS Grid Manager and SAS Viya integration points, to understand how they can be leveraged together. As we stated there, SAS Grid Manager and SAS Viya implement distributed computing according to different computational patterns. This post looks in more detail at how they compare, complement each other, or differ in providing highly available, scalable, high-performance computing.

To better compare SAS Grid Manager and SAS Viya, we can highlight several capabilities:

workload management
scalability
ease of maintenance
availability
parallelization

Workload Management

SAS Grid Manager uses queues to manage jobs, both to decide which ones to start on which hosts, and to manage jobs already running. Queues are used to assign different policies, such as priorities, resource requirements, and permissions. When resources are constrained, jobs can be held in the queues in order to avoid overloading the execution hosts with too many competing requests. As a result, the workload from multiple users is dynamically and efficiently managed.

SAS Viya relies on the operating system for concurrent activity management. SAS Viya provides options to limit resource utilization, such as CPU and memory. Although these capabilities provide basic forms of prioritization and resource management, SAS Viya in the current release does not provide proper workload management.

SAS Grid Manager

Queues to manage jobs
- policies
- priorities
- resource requirements
- permissions
Jobs can be held in queues
Jobs are assigned to the grid node with the best available resources

SAS Viya

Options to set global limits on resource utilization
Options to control CAS table size and CPU consumption
CAS relies on the OS for concurrent activity management
Resource management, not workload management
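The queue behavior described in this section can be sketched in a few lines of Python. This is only a toy illustration of the idea (priorities, holding jobs when resources are constrained, picking the host with the most free capacity); the `ToyQueue` class, its methods, and the notion of "job slots" are hypothetical names introduced here and are not part of any SAS Grid Manager interface.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    priority: int                       # lower value = dispatched first
    seq: int                            # tie-breaker: submission order
    name: str = field(compare=False)
    slots: int = field(compare=False, default=1)


class ToyQueue:
    """Hypothetical illustration only; not the SAS Grid Manager API."""

    def __init__(self, hosts):
        self.free = dict(hosts)         # host name -> free job slots
        self.pending = []               # heap ordered by (priority, seq)
        self._seq = 0

    def submit(self, name, priority, slots=1):
        heapq.heappush(self.pending, Job(priority, self._seq, name, slots))
        self._seq += 1

    def dispatch(self):
        """Start every pending job that fits; hold the rest in the queue."""
        started, held = [], []
        while self.pending:
            job = heapq.heappop(self.pending)
            # Assumption: "best available resources" = most free slots.
            host = max(self.free, key=self.free.get)
            if self.free[host] >= job.slots:
                self.free[host] -= job.slots
                started.append((job.name, host))
            else:
                held.append(job)        # not enough capacity: keep it queued
        for job in held:
            heapq.heappush(self.pending, job)
        return started

    def finish(self, host, slots=1):
        """Release slots when a job ends so held jobs can be dispatched."""
        self.free[host] += slots
```

With two hosts offering three slots in total, four submitted jobs result in three being started in priority order and one being held until `finish` frees a slot and `dispatch` is called again.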

Scalability

SAS Grid Manager's use of a clustered shared filesystem simplifies adding or removing grid nodes. With SAS Viya, you can scale the CAS engine both by adding worker nodes to an existing instance and by defining additional CAS server instances.

CAS Server can scale almost linearly to handle increasing data volumes by adding additional worker nodes

For both Grid and Viya, infrastructure services can be clustered to support increasing numbers of users.

While grid deployments can scale on any supported operating system, clustering capabilities are not available with SAS Viya on Windows.

SAS Grid Manager

Easily add/remove grid nodes
A shared filesystem greatly simplifies grid scalability
Infrastructure services can be clustered
Can scale on any supported operating system

SAS Viya

CAS can scale by adding worker nodes to an existing instance
Additional CAS server instances can be defined
Infrastructure services can be clustered
Clustering capabilities are not available on Windows

Ease of Maintenance

SAS Grid Manager is easier to maintain than traditional SAS 9 computing platforms. You can take a grid node out of service, for example, to apply an operating system patch, without impacting grid functionality. Jobs are only dispatched to the remaining online nodes. At the end of the planned maintenance, you can add the grid nodes back and re-open them to accept new jobs.

The same is true for clustered Viya services: you can take individual members offline, as long as each cluster maintains a minimum quorum. Nodes are automatically re-added to their cluster as they come back online. With a distributed CAS server, worker nodes can be stopped without impacting running analyses, and added back live after the maintenance is complete.

SAS Grid Manager

Easier to maintain than traditional SAS 9 solutions
Grid nodes can be taken offline for maintenance
Jobs are only dispatched to online nodes
A shared filesystem simplifies patching

SAS Viya

Easily take offline members of clustered services for maintenance
CAS server re-shuffles in-memory data amongst the remaining workers
All running analysis and user sessions keep working unaffected
Automatically re-accepts a node when it comes back online

Availability

High availability has always been a key capability of SAS Grid Manager: both services and jobs can be monitored and moved to surviving nodes in case of failure. SAS Viya addresses availability concerns by providing clustering capabilities for all services. If a member of a cluster goes down, all the other members keep servicing client requests. When CAS is deployed in an MPP cluster, the CAS server can maintain multiple copies of data, distributing them across different workers. If a worker becomes unavailable, the controller can instruct other workers to activate their local copies, so all tables remain available.

SAS Grid Manager

If a grid node goes down, jobs are moved to other nodes
Essential services can be monitored to:
- restart failed services
- initiate a failover procedure on another node
Failed batch jobs can be:
- automatically resubmitted
- resumed from the last good checkpoint

SAS Viya

Clustering to improve availability
- automatic detection of failed services
- stateless services to simplify session management across nodes
CAS can maintain multiple copies of data: if a node becomes unavailable, another worker can recover the lost data from its local cache
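The idea of keeping multiple copies of data on different workers can be illustrated with a small Python sketch. It assumes a simple round-robin placement and a "first surviving holder takes over" rule; both are simplifications chosen for this example, and the function names are hypothetical. It does not describe how CAS actually places or activates its copies.

```python
def place_blocks(n_blocks, workers, copies=2):
    """Assign each block to `copies` distinct workers, round-robin.

    Assumption for this sketch: copies <= len(workers).
    """
    placement = {}
    for block in range(n_blocks):
        placement[block] = [
            workers[(block + k) % len(workers)] for k in range(copies)
        ]
    return placement


def active_owner(placement, down=()):
    """Map each block to its first surviving holder (None if all are lost)."""
    owners = {}
    for block, holders in placement.items():
        alive = [w for w in holders if w not in down]
        owners[block] = alive[0] if alive else None
    return owners


placement = place_blocks(6, ["w1", "w2", "w3"])
after_failure = active_owner(placement, down={"w2"})
# Every block still has an owner, because each one has a copy
# on a worker other than w2.
all_available = all(owner is not None for owner in after_failure.values())
```

With two copies per block, losing a single worker leaves every block available; losing two workers at once would leave the blocks held only by that pair without an owner.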

Parallelization

Let’s compare parallelization capabilities through an example. Assume you have a serial sequence of steps – let’s say a data step, followed by three data preparation procedures, two analytical models, and finally a report with the results.

You can reorganize the steps since some of them can run in parallel. When the re-organized job is submitted to SAS Grid Manager, it runs as many steps as possible concurrently; the parallelized sequence may terminate in a fraction of the time of the original one.
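As a rough illustration of this reorganization, the sketch below runs the example job in "waves" with Python's standard `concurrent.futures` module. It assumes that the three data preparation procedures are independent of each other, and likewise the two models; the step names and the `step` function are placeholders invented for this example, and threads on one machine stand in for grid nodes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def step(name, seconds=0.05):
    """Placeholder for a SAS step; it just sleeps to simulate work."""
    time.sleep(seconds)
    return name


# Assumed dependency structure: each inner list holds steps that can
# run concurrently; a stage starts only when the previous one is done.
STAGES = [
    ["data_step"],
    ["prep_1", "prep_2", "prep_3"],
    ["model_1", "model_2"],
    ["report"],
]


def run_serial(stages):
    """Original job: every step runs one after the other."""
    return [step(name) for stage in stages for name in stage]


def run_parallel(stages, max_workers=3):
    """Reorganized job: independent steps in a stage run concurrently."""
    done = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for stage in stages:
            # pool.map blocks until the whole stage has finished
            done.extend(pool.map(step, stage))
    return done
```

Both functions execute the same seven steps, but the parallel version waits for only four stages, so its elapsed time is driven by the longest step in each stage rather than by the sum of all steps.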

CAS tackles parallelization differently. When a CAS server is installed in an MPP architecture, data is split evenly into chunks that are distributed across the cluster nodes. CAS can spread large analytic problems across many machines simultaneously. Each node can produce results faster because it has to analyze only a subset of the data. In the end, the CAS controller collects and summarizes all intermediate results before sending them back to the client.
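The split, compute, and combine pattern described here can be sketched in plain Python. This is a minimal illustration under stated assumptions: a list of numbers stands in for a table, threads stand in for worker nodes, and the analysis is a simple mean. The function names are hypothetical and do not correspond to any CAS action.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(rows, n_workers):
    """Deal rows into n_workers chunks whose sizes differ by at most one."""
    base, extra = divmod(len(rows), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def worker_partial(chunk):
    """Each worker analyzes only its own subset of the data."""
    return sum(chunk), len(chunk)


def controller_mean(rows, n_workers=4):
    """The controller collects the partial results and summarizes them."""
    chunks = split_evenly(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_partial, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

Note that the controller cannot return a result until every worker has reported its partial sum and count, which is the synchronization point that distinguishes this pattern from running independent tasks.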

SAS Grid Manager

The sequential steps of a job (left) can be transformed into parallel execution (right).

SAS Viya

A distributed CAS cluster (right) can produce results faster than a single host (left), thanks to data parallelism.

SAS Grid Manager and CAS address parallelization by implementing two different forms of distributed computing.

Different Forms of Distributed Computing

The example in the previous section highlights how SAS Grid Manager and SAS Viya implement parallelism using two complementary approaches: the former uses task parallelism, the latter data parallelism.

Task parallelism is the concurrent execution of independent tasks on multiple computing cores or hosts.

Data parallelism is the concurrent execution of the same task on multiple computing cores or hosts, each working on a different subset of the data to be analyzed.

SAS Grid Manager

Task Parallelism

Independent tasks are performed on the same or different data.
The computation is asynchronous: As soon as a task is done, the node is available to perform another one.
Parallelization is proportional to the number of independent tasks that can be performed.
Optimum load balancing is achieved by tuning the algorithms used to schedule the right jobs on the best available hosts, given the available resources.

SAS Viya

Data Parallelism

The same task is performed on different subsets of the same data.
The computation is synchronous: All subtasks have to be completed before it is possible to move to the next step.
Parallelization is driven by how data is distributed between compute workers.
Optimum load balancing depends upon the capability of multiple parallel tasks to synchronize with the controller and between themselves.

Conclusion

SAS Grid Manager and SAS Viya provide end users and administrators with many advanced capabilities. Although they sometimes deliver these capabilities using different paradigms, both can provide an efficient and highly available environment that ensures rapid results and optimal resource utilization.

To further understand how to get the most from both an architecture and an administration perspective, you can get additional information from my SGF 2020 paper, SAS® Grid Manager and SAS® Viya®: A Strong Relationship, or its accompanying video.