
Deploying SAS Viya on 1, 2, 3 servers

Started ‎11-08-2019 by
Modified ‎11-11-2019 by

Our customers come to SAS for unique and powerful software solutions to address their data, analytics, and reporting challenges. Our software has to run on hardware that the customer is responsible for providing, whether that's on-premises, in the cloud, or through a third-party provider. Sometimes the business environment dictates how that hardware is provisioned in such a way that we must adapt the SAS solution to make the best fit possible. Fortunately, SAS software is flexible enough to accommodate a very wide range of deployment scenarios.

 

This means that sometimes, often for proofs of concept or budget-constrained implementations, a SAS Viya deployment project comes with a pre-determined number of server machines. This is usually not ideal, so it's important to understand what it can mean. Let's try to answer the question, "My customer gave me X machine(s) for Viya. How do I deploy the software?"

Don't let the cart get before the horse

You know what I'm gonna say here:  get a sizing for your customer's solution performed by the SAS Enterprise Excellence Center. The EEC will ask the questions to determine the workloads involved and return with a recommendation for hardware specifying CPU and RAM to meet typical performance expectations. The shape that the hardware actually takes based on those recommendations as well as other considerations of the business is then something we will need to work with.

 

Whenever possible, let's ensure we've done our due diligence in advance of the customer buying hardware. Once they've committed to a specific layout of server hosts, it can be difficult to make changes later. Work on identifying all requirements for your solution before designing a hardware solution to support it.

 

Now that we've acknowledged how the hardware should be determined in a typical implementation, let's look at what happens when we put the cart before the horse - trying to fit Viya software on an arbitrary number of machines.

 

horse-cart.png

 

The scenarios described below do not cover all possible use cases. This post focuses primarily on deployment patterns which meet original design intent as well as some common anti-patterns.

CAS deployments and tenants

With SAS Viya 3.3, we gained the ability to configure multi-tenant environments. Each tenant runs with its own CAS configuration and dedicated memory space, alongside a shared set of Viya infrastructure services. There is no limit on the number of tenants (or users within those tenants).

 

With SAS Viya 3.4, we also gained the option to install multiple CAS deployments. Each CAS deployment is physically separate from the others (meaning each is installed on its own dedicated host(s)). Keep in mind that each CAS deployment counts toward the total number of CPU cores which have been licensed for CAS overall at the site.

 

If we choose, we can combine these concepts so that each CAS deployment provides analytic services to multiple tenants.
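To make the licensing point concrete, here is a minimal sketch in Python. It is illustrative only: the `CasDeployment` class, its fields, the host names, the tenant names, and the core counts are all hypothetical and do not come from any SAS tooling. It simply adds up the cores of every CAS deployment at a site, regardless of how many tenants each one serves, and compares the total with a licensed core count.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CasDeployment:
    """Hypothetical model of one CAS deployment (not a SAS API)."""
    name: str
    cores_per_host: Dict[str, int]                      # host name -> CPU cores
    tenants: List[str] = field(default_factory=list)    # tenants served

    @property
    def total_cores(self) -> int:
        return sum(self.cores_per_host.values())


def licensed_cores_used(deployments: List[CasDeployment]) -> int:
    """Every CAS deployment counts toward the site-wide CAS core total."""
    return sum(d.total_cores for d in deployments)


def fits_license(deployments: List[CasDeployment], licensed_cores: int) -> bool:
    """True when the combined CAS cores stay within the licensed count."""
    return licensed_cores_used(deployments) <= licensed_cores


# Made-up site: two physically separate CAS deployments, three tenants.
site = [
    CasDeployment("cas-shared", {"host1": 16}, tenants=["finance", "marketing"]),
    CasDeployment("cas-dedicated", {"host2": 32}, tenants=["research"]),
]

print(licensed_cores_used(site))   # 48
print(fits_license(site, 48))      # True
print(fits_license(site, 32))      # False
```

Note that the tenant lists play no part in the arithmetic: in this sketch, adding tenants to a deployment changes nothing, while adding a second CAS deployment does.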

One server

With a single server machine for Viya, we will place all of the Viya infrastructure services as well as one deployment of SMP Cloud Analytic Server together on one host. Running a production workload well in a single-machine environment will often require some beefy server specifications.

 

1-1.png


Scenario:

1-1-a = All on one host machine
For all-on-one host deployments of SAS Viya 3.4, both Linux and Windows Server operating systems are supported by SAS. With all of the Viya software together on one host, this means SMP Cloud Analytic Server is the only option.

 

In the future, SAS expects to offer the ability to run Viya infrastructure services across multiple Windows Server hosts. So Viya infrastructure services may be clusterable and SMP CAS may also be deployed to separate, dedicated host(s). MPP CAS on Windows Server isn't currently on the roadmap.

 

Given a choice, we almost always will prefer Linux over Windows Server. Most importantly, running Viya on Linux gives the option of scaling up from SMP CAS to MPP CAS without having to redeploy Viya from scratch.

Two servers

Compared to a single host, we can expect that two servers will perform twice as much work, handle twice as many users, process twice as much data, right? Well, maybe… mapping functional operations to physical resources is more nuanced than that. But having two servers does have usable benefits beyond straight performance.

One CAS deployment

Even with just one deployment of SAS Cloud Analytic Server, we have multiple options on how to install it.

 

2-1.png

Scenarios:

  • 2-1-a = Separate hosts for Viya infrastructure and SMP CAS
    Place all of the Viya infrastructure services together on one host. Install SMP CAS on the other host. Optimize each host for its purpose.
  • 2-1-b = Install MPP CAS on 2 hosts
    We can deploy MPP CAS to run on two host machines: one host dedicated to run as a CAS Worker and the other as the CAS Controller alongside all of the Viya infrastructure services. The problem is that a two-host MPP CAS deployment is the least efficient MPP configuration possible. Running in MPP mode with only two hosts adds orchestration overhead for what is really a single server's workload. Avoid this scenario.

Two CAS deployments

With two servers now we also have the option of two CAS deployments: installing an SMP CAS on each host. One SMP CAS will live alongside the Viya infrastructure services and the other on a dedicated machine.

 

2-2.png

Scenario:

2-2-c = Two deployments of SMP CAS
With one host dedicated to SMP CAS already, we could also deploy a second SMP CAS to the Viya infrastructure host. Obviously, that SMP CAS would share physical RAM and CPU with the Viya infrastructure services running alongside it, so it is best suited to smaller jobs for fewer users. The deployment of SMP CAS on the dedicated host could be optimized for larger workloads.

Three servers and beyond

As we add more servers to these scenarios, the possibilities multiply. With each additional host, we have options to consider which can significantly improve analytics processing as well as service isolation and availability.

One CAS deployment

3-1.png

 

With three hosts, we've reached the minimum practical cluster size for running MPP Cloud Analytic Server (a two-host MPP cluster is possible, but not worth the overhead).

Scenarios:

  • 3-1-a = Viya infrastructure on one host with MPP CAS on all three hosts
    This is typical for a small multi-host deployment of Viya. The MPP CAS Controller doesn't typically consume as much RAM and CPU as the CAS Workers, so it should reside comfortably enough alongside the Viya infrastructure services on the same host.
  • 3-1-b = Attempt high availability of Viya infrastructure and CAS Workers
    Enable clustering of Viya infrastructure services and rely on built-in data availability offered by the CAS Workers. For Viya 3.3, you could make the case that this would be the smallest footprint possible to achieve high availability. But remember that with Viya 3.4, we now have the CAS Secondary Controller to improve availability of that role - and it's not shown in this illustration because it would require a fourth host. Regardless, this approach prioritizes availability over performance - which is not typical. Avoid this scenario.

Two (or more) CAS deployments

3-2.png

 

If your site requires more than one CAS deployment using only three servers, then SMP CAS is the only option.

Scenarios:

  • 3-2-d = Viya infrastructure on one host with SMP CAS on two hosts
    Dedicate one host for Viya infrastructure services and deploy two SMP CAS. Optimize each host for workload.
  • 3-2-e = Viya infrastructure on one host with SMP CAS on all three hosts
    This could be functional, but it's getting out of hand. If your site has this much CAS computation requirement, then revisit deploying MPP CAS and consider adding more hosts, multi-tenancy, etc. Avoid this scenario.
  • 3-2-f = Attempt high availability of Viya infrastructure and SMP CAS
    Enable clustering of Viya infrastructure services and manually manage analytics and data availability across multiple SMP CAS. While automatic failover of most Viya infrastructure services would be provided (exceptions: pgPool-II and the operations microservice), ensuring availability of CAS resources to end users would require significant manual effort. Instead, consider deploying MPP CAS, adding more hosts, etc. if availability is a significant requirement. Avoid this scenario.

What about other SAS computational engines?

That's a great question. This post has so far ignored considerations for the implementation of the SAS Programming Runtime Environment, the SAS Event Stream Processing Server, the SAS Micro-Analytics Service, the SAS Embedded Process, and more. They each present their own factors and requirements to contemplate in the overall solution architecture. Let's take a quick look at the SPRE...

 

3-1-c.png

Scenario:

3-1-c = Viya infrastructure on one host, SMP CAS on one host, and SPRE on one host
In the other illustrations for this post, the SAS Programming Runtime Environment (consisting of SAS Launcher Server and Service as well as the SAS Compute Server and Service) is not shown, but assumed to be deployed to the same host machine as the rest of the Viya infrastructure services. However, if you expect significant workload for the SPRE, then its components can be deployed to one or more additional servers. The number of licensed CPU cores the SPRE Compute Server can run on is the same count as for CAS. Optimize each host for its workload.

Ready to add more servers

The promise of Viya is that it utilizes many modern approaches to software architecture. A key tenet of this is the ability to scale the solution easily (with "easily" being a relative term best compared to SAS 9 equivalents). In particular, it is possible to scale up CAS from SMP mode on a single host to MPP running as a cluster across multiple hosts with a planned, short-term outage simply by running the necessary Ansible playbook. If you already have MPP CAS and want to add more Workers, this can be accomplished with no outage at all.
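The two growth paths described above behave differently with respect to downtime, and a small sketch can capture that rule. The Python below is a hypothetical model, not a SAS or Ansible interface: `CasLayout` and `add_workers` are names invented for illustration, and the sketch assumes the existing SMP host stays on as the CAS Controller once Workers are added.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CasLayout:
    """Hypothetical description of the hosts in one CAS deployment."""
    controller: str
    workers: Tuple[str, ...] = ()

    @property
    def mode(self) -> str:
        # No Workers means a single-host SMP server; otherwise MPP.
        return "MPP" if self.workers else "SMP"


def add_workers(layout: CasLayout, new_hosts: List[str]) -> Tuple[CasLayout, bool]:
    """Return the grown layout and whether a planned outage is expected.

    Rule taken from the text: going from SMP to MPP needs a short planned
    outage, while adding Workers to an existing MPP cluster needs none.
    """
    if not new_hosts:
        return layout, False
    outage_needed = layout.mode == "SMP"
    grown = CasLayout(layout.controller, layout.workers + tuple(new_hosts))
    return grown, outage_needed


smp = CasLayout("host1")
mpp, outage = add_workers(smp, ["host2", "host3"])
print(mpp.mode, outage)             # MPP True

bigger, outage = add_workers(mpp, ["host4"])
print(len(bigger.workers), outage)  # 3 False
```

The actual work is still done by the Ansible playbook; a model like this is only useful for planning conversations about when an outage window has to be scheduled.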

 

And don't forget that if growing the Viya deployment from one host to many is a future goal, then ensure your customer opts to deploy using Linux-based hosts, not Windows Server.

Other takeaways

The SAS Cloud Analytic Server was designed from its inception to handle the largest analytic workloads. It's fast and extremely scalable. Performance, therefore, is its raison d'être. For best performance, consider starting with at least four server machines with one host for the Viya infrastructure services and three hosts for MPP CAS. This provides separation of Viya infrastructure services from the CAS workload and allows optimization of hosts for their respective purposes.
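As a rough summary of the single-CAS-deployment patterns this post does not warn against, the sketch below maps a list of hosts to roles. It is a planning aid under stated assumptions, not a SAS utility: the function name and role labels are hypothetical, and for four or more hosts it assumes the CAS Controller takes the first of the CAS hosts with the remainder acting as Workers.

```python
from typing import Dict, List

# Hypothetical role labels used only in this sketch.
INFRA = "viya-infrastructure"
SMP = "cas-smp"
CONTROLLER = "cas-controller"
WORKER = "cas-worker"


def assign_roles(hosts: List[str]) -> Dict[str, List[str]]:
    """Map each host to roles for one CAS deployment, by host count."""
    n = len(hosts)
    if n == 0:
        raise ValueError("at least one host is required")
    if len(set(hosts)) != n:
        raise ValueError("host names must be unique")

    if n == 1:
        # 1-1-a: everything on one host, so SMP CAS is the only option.
        return {hosts[0]: [INFRA, SMP]}
    if n == 2:
        # 2-1-a: separate hosts for Viya infrastructure and SMP CAS.
        return {hosts[0]: [INFRA], hosts[1]: [SMP]}
    if n == 3:
        # 3-1-a: Controller shares the infrastructure host, two Workers.
        return {
            hosts[0]: [INFRA, CONTROLLER],
            hosts[1]: [WORKER],
            hosts[2]: [WORKER],
        }
    # Four or more: dedicated infrastructure host, MPP CAS on the rest
    # (Controller placement here is an assumption of this sketch).
    layout = {hosts[0]: [INFRA], hosts[1]: [CONTROLLER]}
    for host in hosts[2:]:
        layout[host] = [WORKER]
    return layout


for count in (1, 2, 3, 4):
    names = [f"host{i}" for i in range(1, count + 1)]
    print(count, assign_roles(names))
```

A real design still starts from the sizing recommendation; this only shows how the recommended shapes change as hosts are added.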

 

For smaller deployments of Viya - three or fewer machines - we acknowledge that huge data processing performance is no longer the primary driving factor. The goal instead is to accommodate cost effectiveness, a constrained budget, limited data volume, few users, a short project lifespan, or similar. It's great that Viya is flexible to these other business drivers… we just need to ensure our customers understand the tradeoffs for the selected approach.

 

High availability and disaster recovery considerations will again drive up the number of necessary server hosts. When we scale out and increase the number of hosts in the cluster for performance, we can often gain advantages in terms of availability as well.

 

My next post illustrates SAS Viya deployments with four or more host machines. See Deploying SAS Viya on more servers.

Coda

SAS Viya offers so much flexibility in terms of architecture, deployment, and operation that it's a real challenge to run down all of the possible options. This post covers only a select few of the possible combinations offered when deploying on up to only three hosts - and we typically see most multi-machine deployments starting with at least four hosts, with more to come later.

 

It's important to understand the crucial driving factors of your customer implementation of SAS Viya. Hardware choices will have a significant impact on Viya's ability to accomplish the goals desired.

Comments

Thanks a lot for this article, this is very helpful and timely. Choosing a relevant topology for Viya is not that easy 🙂

 

Looking forward to reading your next article.

