In the previous article of this series, we looked into SAS Grid Manager and SAS Viya integration points, to understand how they can be leveraged together. As we stated there, SAS Grid Manager and SAS Viya implement distributed computing according to different computational patterns. This post goes into more detail on how they compare, complement, or differ in providing highly available, scalable, and high-performance computing.
To better compare SAS Grid Manager and SAS Viya, we can highlight several capabilities:
SAS Grid Manager uses queues to manage jobs, both to decide which ones to start on which hosts, and to manage jobs already running. Queues are used to assign different policies, such as priorities, resource requirements, and permissions. When resources are constrained, jobs can be held in the queues in order to avoid overloading the execution hosts with too many competing requests. As a result, the workload from multiple users is dynamically and efficiently managed.
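The queueing behavior described above can be illustrated with a toy dispatcher. This is a minimal sketch in plain Python, not SAS Grid Manager code or its API: the class `ToyScheduler`, its methods, the host names, and the slot counts are all hypothetical, and a single numeric priority stands in for the richer queue policies (resource requirements, permissions) mentioned in the text.

```python
import heapq
import itertools


class ToyScheduler:
    """Hypothetical priority-queue dispatcher (illustration only, not a SAS API)."""

    def __init__(self, host_slots):
        # host_slots: free job slots per execution host, e.g. {"node1": 2}
        self.free = dict(host_slots)
        self.pending = []              # heap of (-priority, seq, job_name)
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def submit(self, job_name, priority):
        """Queue a job; a higher priority value is dispatched first."""
        heapq.heappush(self.pending, (-priority, next(self._seq), job_name))

    def dispatch(self):
        """Start as many pending jobs as free slots allow; the rest stay queued."""
        started = []
        while self.pending:
            host = max(self.free, key=self.free.get)  # host with most free slots
            if self.free[host] == 0:
                break  # every host is full: hold remaining jobs in the queue
            _, _, job_name = heapq.heappop(self.pending)
            self.free[host] -= 1
            started.append((job_name, host))
        return started

    def finish(self, host):
        """Release one slot on a host when a job ends."""
        self.free[host] += 1


sched = ToyScheduler({"node1": 1, "node2": 1})
sched.submit("nightly_report", priority=10)
sched.submit("adhoc_query", priority=50)
sched.submit("etl_load", priority=30)
print(sched.dispatch())  # [('adhoc_query', 'node1'), ('etl_load', 'node2')]
sched.finish("node1")
print(sched.dispatch())  # [('nightly_report', 'node1')]
```

The lowest-priority job waits in the queue until a slot is released, which is the overload protection the paragraph describes.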
SAS Viya relies on the operating system for concurrent activity management. SAS Viya provides options to limit resource utilization, such as CPU and memory. Although these capabilities provide basic forms of prioritization and resource management, SAS Viya in the current release does not provide proper workload management.
SAS Grid Manager's use of a clustered shared file system simplifies adding or removing grid nodes. With SAS Viya, you can scale the CAS engine both by adding worker nodes to an existing instance and by defining additional CAS server instances.
The CAS server can scale almost linearly to handle increasing data volumes as you add worker nodes.
For both Grid and Viya, infrastructure services can be clustered to support increasing numbers of users.
While grid deployments can scale on any supported operating system, clustering capabilities are not available with SAS Viya on Windows.
SAS Grid Manager is easier to maintain than traditional SAS 9 computing platforms. You can take a grid node out of service, for example to apply an operating system patch, without impacting grid functionality. Jobs are dispatched only to the remaining online nodes. At the end of the planned maintenance, you can add the grid nodes back and re-open them to accept new jobs. The same is true for clustered Viya services: you can take individual members offline, as long as each cluster maintains a minimum quorum. Nodes are automatically re-added to their cluster as they come back online. With a distributed CAS server, worker nodes can be stopped without impacting running analyses, and added back live after the maintenance is complete.
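Before taking a clustered service member offline, it helps to check that the cluster keeps its quorum. The article does not say how the minimum quorum is computed, so the sketch below assumes a simple majority (more than half of the configured members); the function name `can_take_offline` and its parameters are hypothetical and not part of any SAS tool.

```python
def can_take_offline(total_members, online_members, quorum=None):
    """Return True if one more member can go offline without losing quorum.

    Assumption: when no explicit quorum is given, a simple majority of the
    configured members is required. Real services may use a different rule.
    """
    if quorum is None:
        quorum = total_members // 2 + 1
    return online_members - 1 >= quorum


# A three-member cluster tolerates one member in maintenance, not two.
print(can_take_offline(total_members=3, online_members=3))  # True
print(can_take_offline(total_members=3, online_members=2))  # False
```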
High availability has always been a key capability of SAS Grid Manager; both services and jobs can be monitored and moved to surviving nodes in case of failure. SAS Viya addresses availability concerns by providing clustering capabilities for all services. If a member of a cluster goes down, all the other members keep servicing client requests. When CAS is deployed in an MPP cluster, CAS server can maintain multiple copies of data, distributing them on different workers. If a worker becomes unavailable, the controller can instruct other workers to activate their local copies and all tables remain available.
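The idea of keeping copies of data on different workers can be sketched in a few lines. This is an illustration in plain Python under stated assumptions, not how CAS is implemented: the functions `place_blocks` and `active_holders`, the round-robin placement, and the worker names are all hypothetical.

```python
def place_blocks(num_blocks, workers, copies=1):
    """Assign each data block a primary worker plus `copies` replicas elsewhere."""
    if copies >= len(workers):
        raise ValueError("need more workers than redundant copies")
    placement = {}
    for block in range(num_blocks):
        # Round-robin: the primary comes first, the next workers hold the copies.
        placement[block] = [
            workers[(block + k) % len(workers)] for k in range(copies + 1)
        ]
    return placement


def active_holders(placement, failed):
    """Return the worker serving each block once the `failed` workers are gone."""
    active = {}
    for block, holders in placement.items():
        survivors = [w for w in holders if w not in failed]
        active[block] = survivors[0] if survivors else None  # None: block lost
    return active


placement = place_blocks(6, ["w1", "w2", "w3"], copies=1)
after_failure = active_holders(placement, failed={"w2"})
print(after_failure)
# Every block is still served, because no block had both copies on w2.
print(all(holder is not None for holder in after_failure.values()))  # True
```

With one redundant copy, losing a single worker leaves every block available; losing two workers at once can make some blocks unavailable, which is why the number of copies is a sizing decision.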
Let’s compare parallelization capabilities through an example. Assume you have a serial sequence of steps – let’s say a data step, followed by three data preparation procedures, two analytical models, and finally a report with the results.
You can reorganize the steps since some of them can run in parallel. When the re-organized job is submitted to SAS Grid Manager, it runs as many steps as possible concurrently; the parallelized sequence may terminate in a fraction of the time of the original one.
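The reorganization of a serial job into concurrent steps can be sketched as a dependency graph executed in waves. This is plain Python rather than SAS code, and the dependencies are an assumption made for illustration (each preparation step needs only the data step, each model needs all three preparation steps, and the report needs both models); the step names and the helper `run_in_waves` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+

# Each step maps to the steps it must wait for (assumed dependencies).
deps = {
    "data_step": [],
    "prep_a": ["data_step"],
    "prep_b": ["data_step"],
    "prep_c": ["data_step"],
    "model_1": ["prep_a", "prep_b", "prep_c"],
    "model_2": ["prep_a", "prep_b", "prep_c"],
    "report": ["model_1", "model_2"],
}


def run_in_waves(deps, run_step, max_workers=4):
    """Run every step whose prerequisites are done concurrently, wave by wave."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())
            list(pool.map(run_step, ready))  # wait for the whole wave to finish
            sorter.done(*ready)
            waves.append(sorted(ready))
    return waves


waves = run_in_waves(deps, run_step=lambda name: None)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Under these assumed dependencies, the seven steps complete in four waves instead of seven sequential runs, which is where the elapsed-time saving comes from.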
CAS tackles parallelization differently. When a CAS server is installed in an MPP architecture, data is split evenly in chunks distributed across the cluster nodes. Large analytic problems can be spread by CAS simultaneously across many machines. Each node can produce results faster because it has to analyze only a subset of the data. In the end, the CAS controller collects and summarizes all intermediate results before sending them back to the client.
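The split, compute, and combine pattern described above can be sketched with a simple mean calculation. This is a minimal illustration in plain Python, not CAS code: threads stand in for worker nodes, and the functions `split_evenly`, `partial_stats`, and `distributed_mean` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(rows, n_workers):
    """Split rows into n_workers chunks whose sizes differ by at most one."""
    base, extra = divmod(len(rows), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def partial_stats(chunk):
    """Work done by one worker: summarize only its own subset of the data."""
    return sum(chunk), len(chunk)


def distributed_mean(rows, n_workers=4):
    """Controller role: distribute chunks, then combine the intermediate results."""
    chunks = split_evenly(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


print(distributed_mean(list(range(1, 101)), n_workers=4))  # 50.5
```

Every worker runs the same function on a different chunk, and only the small partial results travel back to the controller.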
Figure (SAS Grid Manager): The sequential steps of a job (left) can be transformed into parallel execution (right).
Figure (SAS Viya): A distributed CAS cluster (right) can produce results faster than a single host (left), thanks to data parallelism.
SAS Grid Manager and CAS address parallelization by implementing two different forms of distributed computing.
The example in the previous section highlights how SAS Grid Manager and SAS Viya implement parallelism using two complementary approaches: the former uses task parallelism, the latter data parallelism.
Task parallelism is the concurrent execution of independent tasks on multiple computing cores or hosts.
Data parallelism is the concurrent execution of the same task on multiple computing cores or hosts, each working on a different subset of the data to be analyzed.
SAS Grid Manager | SAS Viya |
Task Parallelism | Data Parallelism |
SAS Grid Manager and SAS Viya provide end-users and administrators many advanced capabilities. Although sometimes these are addressed using different paradigms, SAS Grid Manager and SAS Viya can both provide an efficient and highly available environment that ensures rapid results and optimal resource utilization.
To learn more about getting the most from both products, from an architecture and an administration perspective, see my SGF 2020 paper, SAS® Grid Manager and SAS® Viya®: A Strong Relationship, or its accompanying video.