
High Availability with SAS Viya: Service Discovery and Routing


Increasing the High Availability capabilities of the SAS platform and improving the experience of end users and administrators in case of failure are key goals for SAS Viya.

 

Viya servers and services can be clustered to increase their availability. With clustering, if a member of the cluster goes down, the remaining members keep servicing requests. But how can a client know how many instances of a service are running? And where are they (i.e. on which host and port are they listening)?

 

Let's start with some theory

Service discovery and routing is built on the idea that clients should not need to know the physical location of services. This concept originated within cloud environments, where services may be started on demand or moved to a different host at any time. But it makes perfect sense also when dealing with High Availability clusters: clients should not be directly aware of any load balancing or failover required to access a service.

 

Within Viya, this is possible because the Apache HTTP Server is the front door to all web applications and microservices; starting with Viya 3.3 it also proxies any access to programming components such as CAS Server Monitor and SAS Studio 4. Apache proxies both external connections (coming from a client such as a browser) and internal ones (service-to-service). The following picture shows an external connection (red arrows) from a browser going through the proxy to reach SAS Studio 5. SAS Studio itself then opens an internal connection (blue arrows), for example to talk to the Compute microservice, and that connection is proxied as well.

 

[Image: 01ViyaConnectionRouting.png]

 

Apache HTTP Server proxies every connection to microservices and web applications.
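
As a quick sketch of what this means in practice, a client only ever needs the front-end host and a context path; the hostname below is a placeholder for your own deployment:

# Minimal sketch: viya.example.com is a placeholder for your Apache front end.
# The client addresses the application by context path only; Apache decides
# which service instance actually answers the request.
curl -ksI https://viya.example.com/SASVisualAnalytics/ | head -n 1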

 

Here is a key point to remember: there are no direct internal connections, ever. This is one of the principles followed by SAS R&D in designing Viya, and it is key for Viya to be ready for cloud environments: cloud services are expected to be dynamic. Services can move from one machine to another, be scaled up by running multiple instances, be scaled down by stopping instances, or be restarted to recover from failures. Service location and characteristics can change over time as the configuration changes.

 

On a side note, PaaS cloud deployments (such as CloudFoundry Viya deployments) obviously follow the same principle, but the actual implementation is different: CloudFoundry provides service discovery and routing as part of its platform services. With BareOS deployments, we had to design and implement this part of the story, too. In this post, we only focus on BareOS.

 

I want to see it!

It's easy to verify how Apache can redirect all connections to the right endpoints. All proxy directives are stored in a specific file, /etc/httpd/conf.d/proxy.conf. Since this is a custom configuration file, upgrading Apache will not overwrite, modify or delete it. Here is an extract from an environment installed by completing the High Availability exercise in the SAS Viya 3.4 Deployment workshop:

 

... more lines ...

# Proxy to SASVisualAnalytics service
<Proxy balancer://SASVisualAnalytics-cluster>
  BalancerMember https://intviya01.race.sas.com:44741 route=sasvisualanalytics-10-242-101-97
  BalancerMember https://intviya02.race.sas.com:43381 route=sasvisualanalytics-10-242-92-48
  BalancerMember https://intviya03.race.sas.com:37604 route=sasvisualanalytics-10-242-97-222
  ProxySet scolonpathdelim=on stickysession=JSESSIONID
</Proxy>

Redirect /SASVisualAnalytics /SASVisualAnalytics/
ProxyPass /SASVisualAnalytics/ balancer://SASVisualAnalytics-cluster/SASVisualAnalytics/
ProxyPassReverse  /SASVisualAnalytics/ balancer://SASVisualAnalytics-cluster/SASVisualAnalytics/

# Proxy to annotations service
<Proxy balancer://annotations-cluster>
  BalancerMember https://intviya01.race.sas.com:41018 route=annotations-10-242-101-97
  BalancerMember https://intviya02.race.sas.com:46752 route=annotations-10-242-92-48
  BalancerMember https://intviya03.race.sas.com:46823 route=annotations-10-242-97-222
  ProxySet scolonpathdelim=on stickysession=JSESSIONID
</Proxy>

Redirect /annotations /annotations/
ProxyPass /annotations/ balancer://annotations-cluster/annotations/
ProxyPassReverse  /annotations/ balancer://annotations-cluster/annotations/

# Proxy to appRegistry service
<Proxy balancer://appRegistry-cluster>
  BalancerMember https://intviya01.race.sas.com:43809 route=appregistry-10-242-101-97
  BalancerMember https://intviya02.race.sas.com:40535 route=appregistry-10-242-92-48
  BalancerMember https://intviya03.race.sas.com:45921 route=appregistry-10-242-97-222
  ProxySet scolonpathdelim=on stickysession=JSESSIONID
</Proxy>

Redirect /appRegistry /appRegistry/
ProxyPass /appRegistry/ balancer://appRegistry-cluster/appRegistry/
ProxyPassReverse  /appRegistry/ balancer://appRegistry-cluster/appRegistry/

... more lines ...

 

A few points we can take away from this fragment:

  • Each service in this environment has been clustered and is currently running on 3 hosts.
  • There is a separate balancer per service; this way all services are independent and can be deployed, scaled up or down, and started or stopped independently of the others.
  • Once a client session is established, the stickysession=JSESSIONID parameter keeps it connected to the same instance of the clustered service.

The last point may raise some concerns: sticky sessions could be an issue for High Availability! If a session is always connected, for example, to host #1, what happens if that machine dies? Won't I lose all my work? Here another key point of the Viya architecture comes to our rescue: all services are stateless, i.e. they do not save any state internally. The state of the current session, for example, is saved in the SAS Cache Server. Were host #1 to die, Apache would route the connection to one of the service's surviving instances, for example on host #2. That instance, in turn, would extract the session id from the incoming request and retrieve its state from the external cache. Everything is preserved and end users do not notice any issue.
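
If you want to see stickiness at work, here is a minimal sketch (the hostname is a placeholder, and any proxied context path from proxy.conf will do): as long as the client replays its JSESSIONID cookie, mod_proxy_balancer keeps sending it to the same member.

# First request: Apache picks a balancer member and curl stores any cookies
# (including JSESSIONID) in a cookie jar.
curl -k -c /tmp/viya.cookies -o /dev/null https://viya.example.com/SASVisualAnalytics/

# Later requests replay the cookie (-b), so stickysession=JSESSIONID routes them
# to the same instance, until that instance drops out of proxy.conf.
curl -k -b /tmp/viya.cookies -o /dev/null https://viya.example.com/SASVisualAnalytics/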

 

Back to the theory

Up to now, we have discussed service routing, i.e. how to route a connection from a client to a running service. The Apache HTTP Server can do this for us. If we think about it for a moment, though, we realize we have not really solved the problem: we have simply moved it down one level, from the client to the proxy. How can Apache actually know where services are running? That is the focus of service discovery. For this, Viya relies on two additional components, and on the way services interact with them: the SAS Configuration Server and the httpproxy service.

 

The SAS Configuration Server, despite its name, is not only a central repository for configuration data, but also the core component for service discovery and service health status.
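
Since the SAS Configuration Server is itself critical for High Availability, a first useful check is to ask it about its own cluster: who the members are and which one is the current leader. Here is a minimal sketch using the sas-bootstrap-config utility described later in this post (paths are from a default Viya 3.4 install; adjust to your environment):

# point the CLI at the local SAS Configuration Server agent
[[ -z "$CONSUL_HTTP_ADDR" ]] && . /opt/sas/viya/config/consul.conf
[[ -z "$CONSUL_TOKEN" ]] && export CONSUL_TOKEN=$(sudo cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tokens/consul/default/client.token)

# list the members of the Configuration Server cluster and its current leader
/opt/sas/viya/home/bin/sas-bootstrap-config status peers
/opt/sas/viya/home/bin/sas-bootstrap-config status leader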

 

Every time a Viya service is started, it connects to the SAS Configuration Server and registers its name, id, hostname, port, plus additional information. It also registers a check that the SAS Configuration Server performs every few seconds to verify that the service is actually up and responsive. In a similar way, every time a Viya service is gracefully stopped, it connects to the SAS Configuration Server and deletes its registration and any associated health check.
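
Under the covers, the SAS Configuration Server is built on HashiCorp Consul, so those registrations and health checks can also be inspected through the standard Consul HTTP API. The sketch below assumes the default API port 8500 on the local host and a TLS-enabled deployment (switch to http:// and drop -k if yours is not), and it reuses the CONSUL_TOKEN variable set in the previous snippet:

# list the health checks registered for the audit service and their current status
# (host, port 8500 and HTTPS are assumptions about the deployment; adjust as needed)
curl -sk -H "X-Consul-Token: $CONSUL_TOKEN" \
     https://localhost:8500/v1/health/checks/audit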

 

Here comes the httpproxy service: its role is to query the SAS Configuration Server for service events and update Apache.

  • When a new service instance starts responding to health checks, httpproxy reads its name, host and port and adds its route to the proxy.conf file we described above. Apache is then forced to reload its configuration and starts routing client connections to the new service instance.
  • When a service instance is stopped or does not respond to health checks, httpproxy removes the corresponding entry from the proxy.conf file; Apache is then forced to reload its configuration and stops routing client connections to the dead service instance. A quick way to watch this happen is sketched just after this list.
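
Below is a simple way to watch this behaviour. The file path is the one shown earlier; the systemd unit name follows the usual sas-viya-<service>-default convention, so adjust it to your environment:

# terminal 1: watch the audit entries in Apache's proxy configuration
watch -n 5 "grep audit /etc/httpd/conf.d/proxy.conf"

# terminal 2: stop one instance of the audit service...
sudo systemctl stop sas-viya-audit-default
# ...its BalancerMember entry disappears once httpproxy notices the deregistration.
# Start it again and the entry returns after the instance passes its health checks.
sudo systemctl start sas-viya-audit-default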

Let's check this, too

Viya provides the sas-bootstrap-config command-line utility to interact with the SAS Configuration Server. We can use it to perform service discovery manually, as in the following examples.

  1. We previously started one instance of the audit service. Let's check it:

     

    # define env variables if not already defined
    [[ -z "$CONSUL_HTTP_ADDR" ]] && . /opt/sas/viya/config/consul.conf 
    [[ -z "$CONSUL_TOKEN" ]] && export CONSUL_TOKEN=$(sudo cat /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tokens/consul/default/client.token);
    #discover the audit service
    /opt/sas/viya/home/bin/sas-bootstrap-config catalog service audit
    {
        "items": [
            {
                "address": "192.168.1.1",
                "node": "intviya01.race.sas.com",
                "serviceAddress": "intviya01.race.sas.com",
                "serviceEnableTagOverride": false,
                "serviceID": "audit-10-242-101-97",
                "serviceName": "audit",
                "servicePort": 43206,
                "serviceTags": [
                    "proxy",
                    "rest-commons",
                    "https",
                    "contextPath=/audit"
                ],
                "taggedAddresses": {
                    "lan": "192.168.1.1",
                    "wan": "192.168.1.1"
                }
            }
        ]
    }
    #verify Apache configuration
    grep audit /etc/httpd/conf.d/proxy.conf
    # Proxy to audit service
    Redirect /audit /audit/
    ProxyPass /audit/ https://intviya01.race.sas.com:43206/audit/
    ProxyPassReverse  /audit/ https://intviya01.race.sas.com:43206/audit/
    

     

  2. We then start two additional instances of the audit service. Let's check again:

     

    #discover the audit service
    /opt/sas/viya/home/bin/sas-bootstrap-config catalog service audit
    {
        "items": [
            {
                "address": "192.168.1.1",
                "node": "intviya01.race.sas.com",
                "serviceAddress": "intviya01.race.sas.com",
                "serviceEnableTagOverride": false,
                "serviceID": "audit-10-242-101-97",
                "serviceName": "audit",
                "servicePort": 43206,
                "serviceTags": [
                    "proxy",
                    "rest-commons",
                    "https",
                    "contextPath=/audit"
                ],
                "taggedAddresses": {
                    "lan": "192.168.1.1",
                    "wan": "192.168.1.1"
                }
            },
            {
                "address": "192.168.1.2",
                "node": "intviya02.race.sas.com",
                "serviceAddress": "intviya02.race.sas.com",
                "serviceEnableTagOverride": false,
                "serviceID": "audit-10-242-92-48",
                "serviceName": "audit",
                "servicePort": 34044,
                "serviceTags": [
                    "proxy",
                    "rest-commons",
                    "https",
                    "contextPath=/audit"
                ],
                "taggedAddresses": {
                    "lan": "192.168.1.2",
                    "wan": "192.168.1.2"
                }
            },
            {
                "address": "192.168.1.3",
                "node": "intviya03.race.sas.com",
                "serviceAddress": "intviya03.race.sas.com",
                "serviceEnableTagOverride": false,
                "serviceID": "audit-10-242-97-222",
                "serviceName": "audit",
                "servicePort": 36281,
                "serviceTags": [
                    "proxy",
                    "rest-commons",
                    "https",
                    "contextPath=/audit"
                ],
                "taggedAddresses": {
                    "lan": "192.168.1.3",
                    "wan": "192.168.1.3"
                }
            }
        ]
    }
    
    #verify Apache configuration
    grep audit /etc/httpd/conf.d/proxy.conf
    # Proxy to audit service
    <Proxy balancer://audit-cluster>
      BalancerMember https://intviya01.race.sas.com:43206 route=audit-10-242-101-97
      BalancerMember https://intviya02.race.sas.com:34044 route=audit-10-242-92-48
      BalancerMember https://intviya03.race.sas.com:36281 route=audit-10-242-97-222
    Redirect /audit /audit/
    ProxyPass /audit/ balancer://audit-cluster/audit/
    ProxyPassReverse  /audit/ balancer://audit-cluster/audit/
    

     

  3. The output of the commands above gives us the URL of each service instance. With these, we can manually check the health of each cluster member; a combined sketch that loops over all instances follows after these checks.

     

    curl -k https://intviya01.race.sas.com:43206/audit/commons/health
    {"description":"Composite Discovery Client","status":"UP"}
    curl -k https://intviya02.race.sas.com:34044/audit/commons/health
    {"description":"Composite Discovery Client","status":"UP"}
    curl -k https://intviya03.race.sas.com:36281/audit/commons/health
    {"description":"Composite Discovery Client","status":"UP"}
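
To avoid typing one curl per instance, discovery and health check can be combined in a small loop. This is just a sketch: it assumes jq is installed on the machine and that the environment variables from step 1 are still set.

# ask the SAS Configuration Server for every registered audit instance,
# then hit the health endpoint of each one
/opt/sas/viya/home/bin/sas-bootstrap-config catalog service audit \
  | jq -r '.items[] | "\(.serviceAddress):\(.servicePort)"' \
  | while read hostport; do
      printf '%s -> ' "$hostport"
      curl -sk "https://${hostport}/audit/commons/health"
      echo
    done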
    
Comments

Very informative blog. Could you help me with the queries below?

 

1. How do you start multiple instances of a service on a Viya host? Is it a manual task, and can it be done per service, or is it limited to scaling whole host groups up and down together?

 

2. In a true cloud context, does Viya support autoscaling? Can we configure Viya to automatically scale up and down based on factors such as load or resource capacity?

 

3. Can we start multiple instances of a service on the same host, or must every instance run on a different host?
