SAS software has always been flexible enough to meet the needs of different workloads, performance requirements, user expectations, and more. Accordingly, SAS Viya works with deployments ranging from one server to hundreds. Determining the actual number of host machines for the SAS solution is not always easy. In my previous post, we looked at the possibilities when forced to fit Viya within 1, 2, or 3 host machines.
What we learned is that while SAS Viya can scale down to run in that limited environment, we must necessarily give up some functionality to make it all fit. This is a valid choice - but one we must ensure the customer understands properly for the best result.
In this post, let's tackle this same concept from a different angle - this time, let's look at a large-scale implementation where hosts provide not just scalability and availability but are tuned specifically for the specialized software roles employed.
It's time to reiterate that we should not simply plan a Viya deployment based on an arbitrary number of host machines, regardless of whether they're few or many. While we can get to a functional deployment that way, it's not going to provide the ideal experience.
Instead, discuss all aspects of requirements and expectations with your customer. And, of course, get a sizing for your customer’s solution performed by the SAS Enterprise Excellence Center. The EEC will ask the questions to determine the workloads involved and return with a recommendation for hardware specifying CPU and RAM to meet typical performance expectations. The shape that the hardware actually takes based on those recommendations as well as other considerations of the business is then something we will need to work with.
With these formalities addressed, let's look at what we have to work with.
Viya - and CAS in particular - is designed to accommodate massive scalability - running across many hosts as well as making it easy to increase compute capacity by adding even more hosts. A lot of machines implies significant investment in terms of hardware costs, ongoing operations, administration, and even more so in software licensing, user training, data management, and so much more.
When customers make that kind of significant investment, they want to protect it. And so there will often be requirements to ensure the ongoing availability of the system to minimize unexpected outages. High availability considerations therefore go hand-in-hand with scalability considerations.
Furthermore, SAS Viya is composed of many disparate software technologies. Compare CAS, whose massively parallel processing depends heavily on all working data being available in RAM, with the SAS Programming Runtime Environment (SPRE), which uses the more classic model of disk-based data storage and access. Both are considered computation engines, and yet they function very differently. Tuning a host ideally for one may mean that the other won't run as efficiently as it could. Fortunately, the Viya architecture allows us to separate these computation engines onto different hosts - if we choose. And we can extend that concept to other aspects of Viya as well. If you're familiar with multi-tier deployments of SAS 9, this is a similar concept applied to new technologies.
Viya offers the ability to deploy the SAS software in preset groupings referred to in Ansible technology as host groups. This allows us to break up the software deployment across a number of host machines. Understanding which software components populate each host group then is necessary to devise a deployment of Viya where hosts can be tuned especially for the software they will run.
In this scenario, we deploy the SAS Viya software to separate hosts which can be optimized for their specific tasks.
[Host groups: consul, httpproxy, pgpoolc, rabbitmq, sasdatasvc]
Why 3 hosts? The SAS Configuration Server software is a specially packaged version of Consul from HashiCorp. Consul is designed to implement HA through a consensus methodology which relies on an odd number of hosts (that is, 1, 3, or 5, not 2, 4, or 6) to prevent the split-brain problem. The same concept also applies to RabbitMQ, which is the technology behind SAS Message Broker.
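The odd-number rule follows from majority arithmetic. Here is a minimal sketch, assuming a simple majority quorum; the function names are illustrative and are not part of Consul, RabbitMQ, or Viya.

```python
def quorum(n_hosts: int) -> int:
    """Smallest majority among n_hosts voting members."""
    return n_hosts // 2 + 1


def failures_tolerated(n_hosts: int) -> int:
    """Number of hosts that can be lost while a majority still remains."""
    return n_hosts - quorum(n_hosts)


# Compare odd and even cluster sizes side by side.
for n in range(1, 7):
    print(f"{n} hosts: quorum={quorum(n)}, tolerates {failures_tolerated(n)} failure(s)")
```

Under this assumption, going from 3 hosts to 4 raises the quorum without tolerating any additional failure, which is why the even sizes buy nothing.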
Deployment: Place the Consul and RabbitMQ host groups on all 3 hosts. The other stateful services need only be placed on any 2 of the hosts for their improved availability. So place sasdatasvc on 2 hosts and httpproxy on 2 hosts (with only 1 overlap). The pgpoolc software cannot currently be clustered (yes, it's still a single point of failure - to be addressed in Viya 3.5). Place pgpoolc on the host without sasdatasvc instances.
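One way to sanity-check a placement like this before editing the inventory is to write it down as data and test the rules against it. The sketch below is only an illustration: the host names are hypothetical, the group names are spelled as in this post, and it takes httpproxy as the second two-host group, since rabbitmq already sits on all three hosts.

```python
# Hypothetical host names for the three stateful-services machines.
HOSTS = {"stateful1", "stateful2", "stateful3"}

placement = {
    "consul": {"stateful1", "stateful2", "stateful3"},
    "rabbitmq": {"stateful1", "stateful2", "stateful3"},
    "sasdatasvc": {"stateful1", "stateful2"},
    "httpproxy": {"stateful2", "stateful3"},  # assumed two-host group
    "pgpoolc": {"stateful3"},
}


def check_placement(p, hosts):
    """Return a list of rule violations; an empty list means the plan fits."""
    problems = []
    for group in ("consul", "rabbitmq"):
        if p[group] != hosts:
            problems.append(f"{group} should be on all {len(hosts)} hosts")
    for group in ("sasdatasvc", "httpproxy"):
        if len(p[group]) != 2:
            problems.append(f"{group} should be on exactly 2 hosts")
    if len(p["sasdatasvc"] & p["httpproxy"]) != 1:
        problems.append("sasdatasvc and httpproxy should overlap on exactly 1 host")
    if len(p["pgpoolc"]) != 1 or p["pgpoolc"] & p["sasdatasvc"]:
        problems.append("pgpoolc belongs on the single host without sasdatasvc")
    return problems


print(check_placement(placement, HOSTS) or "placement follows the guidance")
```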
[Host groups: AdminServices, CASServices, ComputeServices, CoreServices, DataServices, HomeServices, ReportServices, ThemeServices, configuratn, and many more]
In general, the SAS microservices are mostly built using Spring Boot technology running on Java. This is not required, but common for Viya right now. Future microservices may be built in Go or any other HTTP RESTful friendly technology.
Why 2 hosts? To ensure availability in case one physical host goes down.
Deployment: Place all microservice host groups on both hosts.
[Host groups: Operations, ComputeServer, programming]
The SPRE provides a runtime environment for execution of classic SAS program code.
Why 2 hosts? To ensure availability in case one physical host goes down.
Deployment: Place all SPRE host groups on both hosts - with the exception of the operations microservice. It currently cannot be clustered (again, a single point of failure) and can only deploy to a single host. Add more hosts if needed at the time of initial deployment.
[Host groups: sas-casserver-primary, sas-casserver-secondary, sas-casserver-worker]
CAS is our flagship product for massively scalable processing of huge volumes of in-memory data.
Why 5 hosts? We need 2 hosts to provide improved availability of the CAS controller role (primary and secondary). While MPP CAS will function with a single worker, that's inefficient and doesn't provide any worker failover. Two workers give us that minimum redundancy - however, in my opinion, we should always have more worker hosts than controller hosts - so 3 workers then. Add more if/when needed.
Deployment: Place each CAS Controller on separate physical hosts. Place each CAS Worker on a separate physical host. Add more (or remove) hosts at any time.
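As a rough illustration of the CAS layout, the sketch below builds the three CAS host groups and prints them in the `[group]` followed by host-lines shape of an Ansible INI inventory. The host names are hypothetical, and the "more workers than controllers" check encodes the opinion stated above rather than any product rule.

```python
def cas_layout(n_workers: int = 3) -> dict[str, list[str]]:
    """Assign one host per CAS role; host names here are made up."""
    controllers = 2  # primary + secondary
    if n_workers <= controllers:
        raise ValueError("keep more worker hosts than controller hosts")
    return {
        "sas-casserver-primary": ["cas-ctrl1"],
        "sas-casserver-secondary": ["cas-ctrl2"],
        "sas-casserver-worker": [f"cas-wrk{i}" for i in range(1, n_workers + 1)],
    }


def render(layout: dict[str, list[str]]) -> str:
    """Render host groups as INI-style sections, one host per line."""
    lines = []
    for group, hosts in layout.items():
        lines.append(f"[{group}]")
        lines.extend(hosts)
        lines.append("")
    return "\n".join(lines)


layout = cas_layout()
print(render(layout))
print("total CAS hosts:", sum(len(hosts) for hosts in layout.values()))
```

Calling `cas_layout(5)` models adding two more workers later without touching the controller groups.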
Perhaps your customer doesn't need that much specialization and optimization of hosts. Maybe they don't plan to use the SPRE very much. Or it may be that they're new to using Viya and understand needing some specialization as well as accommodating scalability and availability, but want to keep things conceptually simplified. So let's compromise.
In this scenario, we've placed the infrastructure services (stateful services and microservices) together on one set of hosts and the computational services (SPRE and CAS) together on another set of hosts.
This is a very common deployment for the Viya infrastructure services.
Deployment: Place the Consul host group on all 3 hosts. The other infrastructure services need only be placed on any 2 of the hosts for their improved availability. For the stateful services, follow the guidance shown above for 12 hosts. For the microservices, try to spread them evenly based on their memory usage.
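Spreading microservice host groups evenly by memory can be approximated with a simple greedy pass: take the largest group first and put its two instances on the two least-loaded hosts. This is only a sketch of that idea. The memory figures and host names below are invented for illustration, so substitute measured values from your own sizing.

```python
def spread_by_memory(groups, hosts, copies=2):
    """Greedy placement: largest group first, onto the least-loaded hosts."""
    load = {h: 0.0 for h in hosts}
    placement = {}
    for name, gb in sorted(groups.items(), key=lambda kv: kv[1], reverse=True):
        # Pick the `copies` hosts that currently carry the least memory.
        chosen = sorted(hosts, key=lambda h: load[h])[:copies]
        for h in chosen:
            load[h] += gb
        placement[name] = chosen
    return placement, load


# Hypothetical memory footprints in GB, NOT measured values.
groups = {
    "CoreServices": 6,
    "ReportServices": 4,
    "DataServices": 3,
    "AdminServices": 2,
    "HomeServices": 2,
    "ThemeServices": 1,
}
hosts = ["infra1", "infra2", "infra3"]

placement, load = spread_by_memory(groups, hosts)
for name, where in placement.items():
    print(f"{name}: {', '.join(where)}")
print("memory per host (GB):", load)
```

A greedy pass like this does not guarantee the best possible balance, but it is easy to reason about and to rerun when the sizing changes.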
Both CAS and the SPRE are considered runtime environments. But they work very differently in actual operations.
Deployment: Place the SPRE components on the CAS Controller hosts. The rationale here is that the CAS Controller hosts are not typically used as heavily as the CAS Workers. Assuming controllers and workers have identical host sizes, the spare capacity on the controllers might be sufficient room for the SPRE.
Your customer wants a scalable CAS deployment with rudimentary availability improvements. And they don't expect a large number of users as much as they expect a few users to perform work on large volumes of data. Then we can tackle that, too.
Running everything together - but with some careful placement decisions to smooth out the kinks.
Deployment: Place two instances of the Viya infrastructure services on the same hosts alongside the SPRE and the CAS Controllers - except for Consul. In a crazy twist, let's place the Consul host group on 3 of the CAS Workers.
To be frank, I don't know. There's a good chance none of those shown here or in my previous post are exactly right for your customer's requirements and expectations. The point of these posts is to convey some of the architectural concepts behind different deployment decisions so that you can work with your customer to design your own scenario.
Even within a single customer project, you may need to make use of multiple scenarios at the same time. Production environments vs. Dev/Test environments. Or environments optimized for data and analytic processing as compared to environments which specialize in report delivery and consumption. Throw in ESP, MAS, other SAS solutions, and the possibilities are nigh endless.
Contrary to popular belief, SAS Viya does not really handle random startup or shutdown order of its constituent services very well. When everything is installed on a single host machine, the sas-viya-all-services script will correctly handle startup and shutdown order. That specific use-case works so well that the default is to configure the operating system to automatically start Viya services at host startup.
But if you distribute the Viya software components across multiple host machines, then there's a real problem. First, the sas-viya-all-services script does not talk across machines. So it's not for "all services on all machines", but for "all services on this machine". Furthermore, there are some dependencies between Viya software services - some weak (they'll figure it out on their own) and some very strong (startup order matters, or else it just breaks). So we need to disable the automatic startup of Viya services on those hosts and then work to ensure each of the Viya services is started in the proper order to meet those dependencies.
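Conceptually, honoring strong dependencies is a topological ordering problem: start a service only after everything it depends on, and stop in the reverse order. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The dependency edges are placeholders chosen for illustration, not the documented Viya startup order, so replace them with the order given in the SAS documentation.

```python
from graphlib import TopologicalSorter

# Placeholder dependencies: each key starts only after the services listed
# for it. These edges are illustrative, NOT the official Viya ordering.
depends_on = {
    "consul": [],
    "rabbitmq": ["consul"],
    "sasdatasvc": ["consul"],
    "pgpoolc": ["sasdatasvc"],
    "httpproxy": ["consul"],
    "microservices": ["rabbitmq", "pgpoolc", "httpproxy"],
    "cas": ["consul"],
    "spre": ["microservices"],
}

# static_order() yields every service after all of its dependencies.
start_order = list(TopologicalSorter(depends_on).static_order())
stop_order = list(reversed(start_order))

print("start:", " -> ".join(start_order))
print("stop: ", " -> ".join(stop_order))
```

`TopologicalSorter` raises `CycleError` if the dependencies loop back on themselves, which is a useful early warning when the map is maintained by hand.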
To help ensure proper startup and shutdown order of SAS Viya services, we recommend implementing the VIRK's Viya Multi-Machine Services Utilities Playbooks. In that GitHub repository, you can find a set of playbooks to start or stop the SAS Viya services gracefully across the 1 to n machines that are identified in the inventory.ini file. Share and enjoy!
When deploying Viya, some decisions must be made at the time of initial deployment. For example, if your customer might want multi-tenancy later, then you must decide whether to enable multi-tenancy right now at initial deployment -or- deploy Viya as single-tenant now, but then re-deploy from scratch as multi-tenant later. Some aspects of Viya infrastructure clustering have similar considerations - what you decide at initial deployment can have an impact later.
Some quick examples of whether clustering decisions for Viya software must be made at initial deployment or can wait until later:
| Viya service | Decide now or later | Notes |
|---|---|---|
| RabbitMQ | Now | Clustering RabbitMQ after the initial deployment is a manual process with constraints |
| SAS Cloud Analytic Services | Later | Scalable after initial deployment, even from SMP to MPP |
| Apache HTTP Server | Later | Remember that running multiple httpproxy instances requires a 3rd-party load balancer, not provided with Viya |
| SAS Programming Runtime Environment (multi-tenant) | Now | For multi-tenant deployments of SAS Viya, adding ComputeServer hosts after initial deployment is not yet supported |
| SAS Programming Runtime Environment (single-tenant) | Later | Adding more hosts to the ComputeServer after the initial deployment is supported for single-tenant deployments of SAS Viya |
To keep a simple rule in mind, the Viya stateful services are critical and they typically are less forgiving of mis-configuration or other challenges. So try to determine their ultimate deployment topology as early as possible and then stick with it. The microservices are generally more accommodating of changes since they're designed to be resilient in that way already. And then CAS has been designed with future scalability in mind, so scaling it out is relatively easy. As a matter of fact, hosts can be added to (or removed from) an existing MPP CAS deployment without any interruption in service.
The SPRE is a little complicated, as it's not a single component but many, spread across different host groups in the inventory.ini file. And its deployment considerations depend on several key factors. See my article, Deploying the SPRE in SAS Viya 3.4, for more details.
SAS Viya offers a dizzying range of options for architecture, deployment, and operation. A couple of articles are not sufficient to address all of the options. However, we can all improve our knowledge of how the components of Viya work together and then use that to benefit our customers with a deployment plan which is tailored to their specific needs. My next blog post will describe the considerations weighed for an actual customer implementation. See Deploying SAS Viya in the real world.