Now that Viya 3.2 is out, we have the first components of the SAS Environment Manager, which lets us monitor all our services, including, of course, all the microservices. As I reported in my previous blog post, the sas-viya-all-services script and service make it easier to manage and monitor the new Viya 3.2 services as well.
It turns out that there are a couple of cases where the Environment Manager won't provide the information you need, and you can use the script sas-viya-all-services to rescue your troubleshooting efforts.
You're probably aware of the new component called the SAS Configuration Server (also called consul), which acts as a store for all the configuration information that the other microservices (and web applications) need. But it also helps us monitor all the other services, as I'll show you in this blog.
As a first pass you would normally check the Environment Manager Dashboard, which will inform you of services down:
If you see something amiss as in the screenshot above, the next step is to go to the Resources->Machines and Services->Services page to get the details and determine which specific services are down:
However, there's more than meets the eye. Scrolling through the list of services, as shown above, you will find two services that appear to be related:
Here's what they look like in the list:
It turns out that these two services are critical in reporting to you which other services are available or not.
The way to identify all services on any machine is with the following command:
All SAS Viya-related services (that includes servers such as the web server too) are installed in this directory, and begin with the string "sas-viya-".
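As a rough illustration of that prefix-based lookup, here is a minimal Python sketch. It is not the command referred to above, just the same idea expressed as a function. The names `list_viya_services` and `service_dir` are hypothetical, and you would pass in whichever directory holds the service scripts on your machine.

```python
import os

# Prefix shared by all SAS Viya-related services, as described in the text.
SERVICE_PREFIX = "sas-viya-"


def list_viya_services(service_dir):
    """Return the sorted entry names in service_dir that start with 'sas-viya-'.

    service_dir is a hypothetical parameter: the directory where the
    service scripts are installed on the machine being inspected.
    """
    return sorted(
        name for name in os.listdir(service_dir) if name.startswith(SERVICE_PREFIX)
    )
```

Running this on each machine in turn gives the same per-machine view that the rest of this post relies on.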
Using this command, if you then dive down one more layer to the OS level and ask for services on each machine, you will find the following:
sas-viya-configuration-default, a service that's typically on just one machine:
sas-viya-consul-default (see above), found on every machine in the deployment.
But you don't find anything called "sas-viya-configuration-server", or similar.
What's going on here?
Well, the Configuration Service is a microservice, which is responsible for managing changes to configuration data. That means when an administrator modifies anything stored in the configuration server, it goes through this service, which updates the server (data is stored as name-value pairs in the server). Typically it's on the first "Viya" machine (not the CAS Controller or Worker) in a deployment. Like all microservices, you're only required to have one Configuration Service, which can be on any Viya machine, but in the interests of fail-over and throughput, it's easy to deploy more than one, especially in larger and more complex deployments.
The service called sas-viya-consul-default actually is the SAS Configuration Server, but it can operate in two modes: as a server (storing all those name-value pairs), or as a configuration "agent", meaning that it picks up data about the other services running on its machine, and passes that information to the server, which may be on a different machine. The start-up parameters for that process determine whether it's currently acting as a server or as an agent, and even when it's in the role of server, it also does the "agent" work of scanning the local machine for information about all the other services on that machine. That way you don't have to have a "server" and an "agent" on the same machine; you just have one sas-viya-consul-default service on that machine, doing either one job or doing both jobs.
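To make the "start-up parameters decide the role" idea concrete, here is a small sketch that classifies a process from its command line. It assumes the role is visible as Consul's standard `-server` flag on the command line. How SAS actually launches sas-viya-consul-default is not shown here, and the role could also be set in a configuration file, which this sketch would not see. The function name `consul_role` and the sample command lines are hypothetical.

```python
import shlex


def consul_role(command_line):
    """Classify a Consul process as 'server' or 'agent' from its arguments.

    Assumption: the role is set with the -server flag on the command line.
    Only the bare '-server' and '-server=true' spellings are recognized.
    """
    for arg in shlex.split(command_line):
        if arg in ("-server", "-server=true"):
            return "server"
    # Without the flag, the process only collects and forwards local status.
    return "agent"
```

Remember that a process classified as "server" here still does the agent's work for its own machine, so the two roles are not mutually exclusive.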
And of course for HA and possible performance enhancement, a deployment can and often will include multiple servers.
A possible "gotcha" with all this is that if an agent on any of the machines in the deployment goes down, you won't see that in the Environment Manager interface, since it only reports on the sas-viya-consul-default that's acting in the role of a server (typically one instance in a deployment). But if an agent goes down, it will appear that all the other services on its machine are down as well, since the agent is the thing that reports on the status of all the other microservices on its machine.
All of which brings us back to the usefulness of the sas-viya-all-services script:
The first thing to know is that if you see more than one service down in the Environment Manager interface, you should run the sas-viya-all-services script to find out exactly what's happening. Chances are that the agent is down, causing the other services to appear down. Even more importantly, sometimes when a microservice goes down, it will not show as down in Environment Manager; instead, that service will simply not appear at all, obscuring the fact that it's down. In other words, rather than showing in Environment Manager as a red "X", that service just won't be listed.
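The reasoning behind those two cases can be sketched as a comparison between what Environment Manager shows and what the operating system reports for one machine. This is only an illustration of the logic, not something the script or Environment Manager does for you. The function `triage`, its parameters, and the verdict strings are all hypothetical, and you would have to fill in the inputs by hand from the two views.

```python
CONSUL_SERVICE = "sas-viya-consul-default"


def triage(os_status, em_down, em_listed):
    """Compare OS-level service status with the Environment Manager view.

    os_status: dict of service name -> True if the service is up on the machine
    em_down:   set of service names Environment Manager shows as down
    em_listed: set of all service names Environment Manager lists at all
    Returns a list of (service, verdict) tuples, local agent first.
    """
    findings = []
    # If the local agent is down, nothing on this machine is reported reliably.
    if not os_status.get(CONSUL_SERVICE, False):
        findings.append((CONSUL_SERVICE, "agent down, other reports unreliable"))
    for name in sorted(os_status):
        if name == CONSUL_SERVICE:
            continue
        up = os_status[name]
        if up and name in em_down:
            findings.append((name, "shown as down but actually up"))
        elif not up and name not in em_listed:
            findings.append((name, "down and not listed"))
        elif not up:
            findings.append((name, "down"))
    return findings
```

An empty result means the two views agree and everything on that machine is up.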
In either of the above cases, the sas-viya-all-services script comes to the rescue, providing a list of all services on any given machine, and indicating their status. Here's an example of a case where Environment Manager indicates that sas-viya-backup-agent is down, when in fact it's sas-viya-consul-default (as an agent) that's actually down, whereas the backup agent is not:
In either case (1. the wrong service is indicated as down, or 2. a service is down but not showing up in Environment Manager), you can then use the sas-viya-all-services script to restart any downed services on a machine, or, if there's only one service you need to restart, you can use the script specific to that service if you prefer:
The combination of the new Environment Manager interfaces and the sas-viya-all-services script makes it easy to monitor and manage all your Viya-related services.