Hello,
A medium-sized SAS platform is typically installed across several servers: the metadata tier goes on one server, the mid-tier on another, and the compute tier on one or more servers.
When the servers were physical machines, it was important to spread the load across servers because of bottlenecks on the hardware side. On the assumption that most of these bottlenecks don't apply to deployments on virtualised servers in data centres, what is the argument for splitting the deployment across multiple virtual machines?
Just to stress the point, I am talking about small to medium deployments here.
Regards,
Vasilij
Hello @VasilijNevlev,
from my understanding, the separation between the "tiers" (metadata, compute and middle-tier/web) is like separating life and work, let us say. Separation is mostly an enabling and protecting feature.
Of course, the final deployment is an agreement between acceptance of/requirements for those features, the budget, and the maintenance costs (not only hardware, but also workforce and even software). That is why the discussion between IT and the business is important to have.
SAS server licenses are priced on computing power (# of cores). By putting the CPU-intensive midtier (web application server) on a separate server, in my understanding one can safely increase CPU power there without having to upgrade the SAS license for the compute tier.
Keep in mind that the web server and web application server are based on Apache httpd and Tomcat (open source, released under the Apache License).
Hello Kurt,
That is a good point, thank you.
Regards,
Vasilij
In my experience one of the main reasons for having multiple servers is the differing load requirements. The metadata server CPU load is light but it needs quick memory response so as not to slow down apps. The SAS Compute / App server is CPU and IO heavy and the mid-tier server is memory-intensive.
If you put all of these SAS servers on one virtual server, the danger is that any resource overload caused by SAS applications could easily make your SAS environment totally unusable, as there would be no resources left for the metadata or mid-tier servers.
We operate a small SAS platform on multiple virtual servers (4 cores) and performance is very good.
From my experience, the metadata server is not really that much of a resource consumer.
FYI: our current server was last booted up on April 21. Since then, the metadata server has consumed 18.5 CPU minutes, while the web application server (tomcat) for WRS and Studio has eaten 427 CPU minutes, and the Environment Manager 263. From the values I get from watching the workspace servers perform, I can assume that the Base SAS workload for that time period is way beyond that.
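For anyone who wants to check the same kind of figures on their own Linux server, here is a minimal sketch using standard ps options; the 'sas' service account name and the process name patterns are assumptions for illustration and will differ per deployment.

```
# Cumulative CPU time per process since it started (assumed name patterns)
ps -eo cputime,etime,comm,args | grep -E 'sasmeta|tomcat|java' | grep -v grep

# Rough total of CPU minutes for everything owned by an assumed 'sas' account
ps -u sas -o cputime= | awk -F'[-:]' '{
    n = NF
    s += $n + $(n-1) * 60
    if (n > 2) s += $(n-2) * 3600
    if (n > 3) s += $(n-3) * 86400
  } END { printf "total CPU minutes: %.1f\n", s / 60 }'
```

ps reports cputime as [DD-]HH:MM:SS, which is why the awk part splits on both "-" and ":" before converting to minutes.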
IMO, you can safely have the metadata server on the compute server (if it is not a grid, of course).
I'm not an expert in this area, but there is a lot more to it than just performance optimisation as @JuanS_OCS has described very well.
One topic not mentioned is SAS licensing costs. SAS is normally licensed by the number of cores on the SAS app server (normally the server with the highest core count). If you install SAS on a single virtual server and add extra cores to handle the metadata and mid-tier, then this will cost you more than a multiple-server setup.
Hi SASKiwi,
That makes perfect sense; Kurt mentioned the licensing point as well. You are both suggesting putting the components that are not under license onto a separate machine to get more CPU compute cycles per license dollar spent. This is an excellent point.
Regards,
Vasilij
I don't pretend to reply for my esteemed colleague; this is just a personal thought of my own.
Setting up affinity rules (e.g. on RHEL 6/7, control groups that reserve some CPU power for SAS compute sessions, the SAS Metadata Server, the SAS mid-tier, etc.) might be effective, if authorised by the terms of your SAS contract. However, this implies ensuring that the IT admins never forget to re-apply these rules when a change (upgrade, hotfix, etc.) occurs at the system level, and that is a strong assumption, almost a risk in itself. What if, for instance, after a mere weekend reboot, the cgroups CPU rule is lost and the mid-tier processes now occupy between 40% and 60% of the CPU workload instead of, say, 20% as defined previously? Keeping track of these kinds of rules is difficult, so relying on a strict separation of tiers, each tier on a separate machine (physical or virtual), is much easier, and not so costly with VMs.
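To make the cgroups idea concrete, here is a minimal RHEL 6 style sketch using cgconfig.conf and cgrules.conf; the group names, share values and the sas/sassrv/websrv account names are assumptions for illustration, not settings taken from any SAS deployment guide.

```
# /etc/cgconfig.conf -- hypothetical CPU-share split between SAS tiers
group sascompute {
    cpu {
        cpu.shares = 600;    # bulk of CPU for workspace/compute sessions
    }
}
group sasmeta {
    cpu {
        cpu.shares = 200;    # metadata server: light load, latency-sensitive
    }
}
group sasmidtier {
    cpu {
        cpu.shares = 200;    # web application server (tomcat) processes
    }
}

# /etc/cgrules.conf -- map OS accounts (assumed names) to the groups above
# user      controller   destination
sassrv      cpu          sascompute
sas         cpu          sasmeta
websrv      cpu          sasmidtier
```

And this split only holds as long as the cgconfig/cgred services are active and the rules survive every system change, which is exactly the maintenance risk described above.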
As a rule of thumb, SAS Technical Support generally recommends dedicating a single machine to the Metadata Server. As someone said previously, this should not be taken as a hard prerequisite; the Metadata Server load being light, it is compatible with the mid-tier or the compute tier in my experience. Only environments with High Availability requirements need a separate Metadata Server, sometimes even a clustered one.
As @JuanS_OCS also said, in order to benefit from scalability, installing the compute tier on a distinct machine makes it possible to add more power later (within the limits of the contract) just by adding another server and setting up Load-Balanced Workspace Servers, also called a Clustered Workspace configuration, which creates a computing cluster SAS-wise. No SAS Grid or Hadoop cluster is required, only the extra machine and SAS Management Console (a shared file system is also almost a requirement).
Thank you @JuanS_OCS
I am surprised you are saying virtualisation isn't perfect yet. I guess it depends on the implementation, but from my experience virtualisation doesn't tend to have obvious bottlenecks compared to a dedicated implementation; then again, it is all about the setup, I guess.
I feel the other features you mentioned are nice to have, but they don't have a noticeable benefit to the business. In the case of small deployments, it might just make sense to keep things simple; it is only with bigger deployments that multi-tenancy, HA and fragmented backups are really beneficial.
Thank you for your comments, really useful to keep those in mind.
Regards,
Vasilij