
How-to: Modifying /etc/hosts for a SAS Viya deployment


We've been doing our bit to prepare for SAS Viya 2020.1 and beyond. There's been a lot to learn about Kubernetes in general, and specifically about configuring Viya 4 and the abstraction layers it relies on. One item that sent us down a rabbit hole is modifying the /etc/hosts file to add custom hostname aliases.

 

If you're already a Kubernetes expert administrator, then this post won't help you. However if, like me, you're putting on your first pair of k8s boots and stumbling out into the real world, then hopefully you'll gain a bit more insight into how SAS Viya works within Kubernetes.

 

Alternate titles for this post:

  • Easy things made hard
  • Why, oh sweet baby Linus, why?
  • 2020 is making life difficult. Why should your Viya deployment be any different?
  • YAML: you ain't my language

Credit to my colleague Erwan Granger for that last one. 😜


Why modify /etc/hosts?

I suspect most of you already know why, but let's make it clear. Your host OS is probably configured to refer to the /etc/hosts file first when attempting to resolve a hostname to an IP address. If nothing matches in /etc/hosts, then the network DNS is tried. That's not a universally guaranteed setup, but it's typical. This approach provides flexibility to override and/or supplement the network DNS for hostname resolution on a host-by-host basis.
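
On most Linux hosts, that lookup order is governed by the "hosts:" line in /etc/nsswitch.conf. A typical default looks like this, with "files" (meaning /etc/hosts) consulted before DNS:

hosts:      files dns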

 

By default, the /etc/hosts file usually just provides resolution for "localhost" to point to IP address 127.0.0.1 (and IPv6 ::1). But we can add more lines to the file, each specifying an IP address along with one or more hostname aliases (referred to from here on as "host aliases").
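
For illustration, a minimal /etc/hosts with one custom alias line added might look like this (the 10.96.1.1 address and my-a-host alias are made-up examples that we'll reuse below):

127.0.0.1   localhost
::1         localhost
10.96.1.1   my-a-host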

 

We want to add lines like this when hostnames we want to refer to aren't known by the DNS. This can happen for many possible reasons, including:

  • Alternative hostnames
  • Hostnames which are purposely excluded from DNS
  • Hosts on other subnets that you have a route to, but which the default DNS doesn’t handle
  • And so on

The challenge with using /etc/hosts is that you are responsible for keeping it maintained and up-to-date on every host that relies on it. That can become a large and tedious task if you aren't careful. That's why DNS exists as a central, single point of contact for hostname resolution.

 

Of course, the /etc/hosts file isn't meant to be a panacea to resolve your network routing issues. But sometimes it is the exact right place to put what you need.

Don't do this

Normally on a machine where you have sufficient admin privileges in the OS, you'd simply edit the /etc/hosts file to add new lines for your desired host aliases.

 

In Kubernetes, we refer to the host machines as "nodes". So for a Viya 4 deployment, you might think you should just log on to a node's OS directly, modify /etc/hosts, quickly validate that you can ping the host alias as desired, slap your hands together signifying a job well done, then head off to the pub (hey, it's 5 o'clock somewhere).

 

The problem with this approach is that while the nodes will see this change, the Kubernetes pods and their constituent containers won't. The containers have their own /etc/hosts file. You need to make the change there.
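
You can see this for yourself. Here's a quick, hypothetical demonstration - the pod name is one from my deployment, so yours will differ:

$ # on the node: add the alias and confirm the node resolves it
$ echo "10.96.1.1  my-a-host" | sudo tee -a /etc/hosts
$ ping -c 1 my-a-host

$ # inside a pod running on that same node: the alias isn't there
$ kubectl exec sas-logon-app-65b6795f7f-6b94v -- grep my-a-host /etc/hosts
$ # no output - the container has its own /etc/hosts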

 

Now how many pods is Viya 4 currently running? It could be a number ranging from 120 to over 350 depending on the product mix. Remember what I said about how tedious it'd be to manually maintain /etc/hosts across a lot of machines? Yeah. Boom.
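
Curious about your own count? Assuming your kubectl context points at the cluster, this will tell you:

$ kubectl get pods -n <Viya Namespace> --no-headers | wc -l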

 

We must think like large-scale enterprise hosting admins. When you're running Kubernetes, that's exactly what you are, even for just a single deployment of Viya 4.

Kubernetes knows where everything is

The various services for Viya 4 are divvied up to run in containers. There are so many containers that we rely on Kubernetes to provide a framework for their administration and operation. That means that the containers for Viya are placed into Kubernetes pods. And then Kubernetes deploys those pods to the various nodes (host machines) which are provisioned for the Viya workloads.

 

Declaring a unique manifest for every individual pod is too fine-grained for most use cases. Many pods share a common set of configuration options, so Kubernetes provides the concept of a resource "kind" (like ReplicaSet) to define pods with shared characteristics. Some built-in kinds already fit well with Viya services, and for others, SAS has defined custom kinds. The relevant kinds are listed here (and you can query your cluster for them, as shown after the list):

  • CASDeployment
    Defines pods for the SAS Cloud Analytics Service containers
  • CronJob
    For jobs that start on a schedule (as pods). For Viya, this is currently just for backups and the sas-update-checker.
  • DaemonSet
    This defines pods where one-and-only-one is needed on each node. For Viya, currently home to a utility pod which pre-pulls large container images for Viya to reduce end-user wait time.
  • Deployment
    The default Kubernetes kind for pod deployment definitions. Used for the microservices and similar Viya infrastructure.
  • PodTemplate
    Used to spawn new jobs (as pods) on demand with a limited lifespan. For example, when you log off SAS Studio, your launcher pod will shut down.
  • StatefulSet
    Defines pods for the stateful services which have unique deployment considerations (like SAS Management Server, SAS Message Services, SAS Infrastructure Data Server, etc.)
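
To see instances of these kinds in your own deployment, ask kubectl for them directly. (CASDeployment is a custom resource, so it's only recognized in clusters where Viya's custom resource definitions are installed.)

$ kubectl get casdeployments,cronjobs,daemonsets,deployments,podtemplates,statefulsets -n <Viya Namespace>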

So now we need a way to direct Kubernetes to modify the /etc/hosts file for these various kinds of pod deployments.

Kustomize with a k

Every pod is defined with a manifest, which is a plain-text file that describes the pod's attributes (which container to run, storage to attach, network interface to bind, etc.). And we can use that to define new lines for /etc/hosts if we want to. But again, with hundreds of pods in a Viya 4 deployment, this isn't where we want to be.
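
For reference, here's what that per-pod mechanism looks like: a hostAliases entry directly in a pod's spec. This standalone manifest is just an illustration, not part of Viya:

---
apiVersion: v1
kind: Pod
metadata:
  name: host-alias-demo
spec:
  hostAliases:
    - ip: "10.96.1.1"
      hostnames:
        - "my-a-host"
  containers:
    - name: demo
      image: busybox
      command: ["cat", "/etc/hosts"]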

 

Enter Kustomize, the native configuration customization tool for Kubernetes. The general idea is that you can use (and re-use) standard templates for your pods as a starting point and then extend them further with overlay templates. From a SAS perspective, this allows us to ship a standard set of software manifests for Viya to customers, which can then be configured and extended using site-specific overlay templates.

 

Kustomize relies on YAML to describe things… and so let's look closer at what we want to do.

It's YAML writing time

Let's take a look at the YAML syntax needed to add new lines to the /etc/hosts file of select pods in our Viya deployment.

 

First of all, we'll create a new plain-text file on your Kubernetes control host and place it in a sub-directory with a name that's easy to reference. Something like: /path/to/project/<Viya Namespace>/site-config/network/etc-hosts_addendum.yaml.
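
Relative to the project base, the layout looks something like this (only site-config/network is specific to our task; the rest comes with a standard Viya deployment setup):

/path/to/project/<Viya Namespace>/
├── kustomization.yaml
├── sas-bases/
└── site-config/
    └── network/
        └── etc-hosts_addendum.yaml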

 

In that file, we define a Kustomize patch transformer, which can add, replace, and/or remove content in the original manifest definitions. Here's a patch that adds hostAliases for CAS:

 

---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: CASDeployment
metadata:
  name: etc-hosts-cas
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"

 

This patch transformer will insert (or replace) the hostAliases definition in the site.yaml file for pods that match "target: kind: CASDeployment". That's right - we're using YAML in this file to create more YAML in a different file. When followed through to completion, the desired result will be that each container in those pods will have the following lines added to its /etc/hosts file:

 

10.96.1.1  my-a-host
10.96.2.2  my-b-host.customer.com  my-b-host

 

Depending on the spec of the target, there may be some error checking (like IP addresses must be all numeric) or none at all. Ultimately, it's on you to ensure that those IP addresses and host aliases are correct and intended for the target pods.

 

Other directives of note in the YAML above:

  • "metadata: name: etc-hosts-cas"
    A convenient label describing the purpose of this patch transformer. Make sure the name is unique! If it's not, then later on when you try to build with Kustomize, it will <sarcasm>helpfully</sarcasm> explain:

     

    Error: accumulateFile "accumulating resources from 'site-config/network/etc-hosts_addendum.yaml': may not add resource with an already registered id: ~G_builtin_PatchTransformer|~X|etc-hosts-deployment", loader.New "Error loading site-config/network/etc-hosts_addendum.yaml with git: url lacks host: site-config/network/etc-hosts_addendum.yaml, dir: got file 'etc-hosts_addendum.yaml', but '/usr/csuser/clouddrive/project/deploy/lab/site-config/network/etc-hosts_addendum.yaml' must be a directory to be a root, get: invalid source string: site-config/network/etc-hosts_addendum.yaml"
    
  • "op: add"
    Indicates the addition of new lines, not replacing or removing
  • "path: /spec/controllerTemplate/spec/hostAliases"
    This is not a physical directory path in the OS. It's a reference to the hierarchical structure which organizes the site.yaml file that defines Viya 4's deployment. And hostAliases is a construct used for adding lines to /etc/hosts.
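
To visualize that path, here's a hypothetical excerpt of a patched CASDeployment resource; each segment of the path corresponds to a level of YAML nesting:

spec:
  controllerTemplate:
    spec:
      hostAliases:
        - ip: "10.96.1.1"
          hostnames:
            - "my-a-host"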

We can add more patch transformer definitions to this same file to drive similar changes to the /etc/hosts files in other Viya pods. To make this same change on all Viya 4 pods, we need to apply several patch transformers. Here's the full etc-hosts_addendum.yaml file with one patch transformer per kind:

 

---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: CASDeployment
metadata:
  name: etc-hosts-cas
patch: |-
  - op: add
    path: /spec/controllerTemplate/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"
---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: DaemonSet
metadata:
  name: etc-hosts-ds
patch: |-
  - op: add
    path: /spec/template/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"
---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: Deployment
metadata:
  name: etc-hosts-deployment
patch: |-
  - op: add
    path: /spec/template/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"
---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: PodTemplate
metadata:
  name: etc-hosts-job
patch: |-
  - op: add
    path: /template/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"
---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: StatefulSet
metadata:
  name: etc-hosts-statefulset
patch: |-
  - op: add
    path: /spec/template/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"
---
apiVersion: builtin
kind: PatchTransformer
target:
  kind: CronJob
metadata:
  name: etc-hosts-cj
patch: |-
  - op: add
    path: /spec/jobTemplate/spec/template/spec/hostAliases
    value:
      - ip: "10.96.1.1"
        hostnames:
          - "my-a-host"
      - ip: "10.96.2.2"
        hostnames:
          - "my-b-host.customer.com"
          - "my-b-host"

 

Of course, if you only need the new host aliases for certain kinds of pods - such as defining a remote DBMS for CAS to work with using SAS/ACCESS data connectors - then provide only the patch transformers needed to do the job. Keeping these changes small in scope reduces the chance of unintended consequences later.

Tell Kustomize what we did

So we've created this new YAML file, but that's not enough. We need to tell Kustomize where to find it when building the site.yaml file for Viya 4 deployment.

 

We do that by editing the base kustomization.yaml file for our project in /path/to/<Viya Namespace>. Look for the "transformers:" section and add a reference to our etc-hosts_addendum.yaml file at the end:

 

transformers:
  - sas-bases/overlays/required/transformers.yaml
  - sas-bases/overlays/external-postgres/external-postgres-transformer.yaml
  - site-config/network/etc-hosts_addendum.yaml

 

Now we're ready for Kustomize to take this change we want and build the site.yaml file for the Viya deployment. So from the same directory as the base kustomization.yaml file, go ahead and build the updated site.yaml:

 

$ cd /path/to/<Viya Namespace>
$ kustomize build -o site.yaml

 

If it runs successfully, there won't be any output, but you will see that the timestamp of the resulting site.yaml file will be updated as its content has changed.

 

At this point, we can peek inside the site.yaml file. In my small Viya 4 deployment, I have 126 host alias definitions for the various Viya pods. Yours might have many more. For pods where no new host aliases are applied, you'll see it defined as an empty sequence: "hostAliases: []".
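
A quick way to take that peek (the pattern and context count are just examples; adjust for your site):

$ grep -c 'hostAliases' site.yaml
$ grep -A 7 'hostAliases' site.yaml | less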

Put it into action

If everything checks out and looks right so far, then we're ready to make these changes take effect. This is done by directing Kubernetes to apply the site.yaml file to the cluster:

 

$ # as a k8s user with sufficient privileges in the Viya namespace
$ kubectl apply -n <Viya Namespace> -f site.yaml

 

Assuming all is well, this command will print several status lines as it processes site.yaml and then finish. But it is possible to create a site.yaml file which is syntactically correct YAML, yet which kubectl still has a problem with.

 

Any problems kubectl raises at this point will likely be about intent rather than syntax: the YAML parses, but the result Kubernetes is asked to apply to hostAliases might not make sense. Think of it as someone calling you by your name in reverse order. The pieces are all there, but you don't get the desired appellation.

 

For "normal" pods - which comprise most of a Viya deployment - successfully applying the site.yaml is all you need to do to put the new host aliases into effect. But some pods aren't "normal", requiring operators to refer to pod templates to direct the instantiation of pods with the change. CAS in particular is one of these.

 

To get CAS to pick up the host aliases, we need to effectively stop and restart it.

 

$ kubectl delete pod -n <Viya Namespace> --selector='app.kubernetes.io/managed-by=sas-cas-operator'

 

Because Kubernetes is configured to maintain a specific number of running pods for CAS, deleting the current pods causes Kubernetes to start new ones to replace them. It's the classic Ship of Theseus approach to software management, a puzzle Greek philosophers were pondering over 2,000 years ago.
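
You can watch the replacement pods come back up (Ctrl-C to stop watching):

$ kubectl get pods -n <Viya Namespace> --selector='app.kubernetes.io/managed-by=sas-cas-operator' --watch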

Validate that it works

You're probably wondering if these changes really worked. Or you should be. With this many levels of redirection and abstraction, anything is possible. We created a new YAML file which defined override parameters for Kustomize to reference when assembling the final site.yaml file. Then site.yaml was applied to our cluster by kubectl. Finally, we had to direct Kubernetes to restart the CAS-specific containers, because they're managed by an operator and aren't immediately affected by applying site.yaml like other Viya pods.

 

Still with me? We're in the home stretch. Just hold on a little longer.

 

The test is pretty simple: run a command inside a container to check the /etc/hosts file for the new aliases. Something like:

 

$ kubectl exec --stdin --tty -n <Viya Namespace> sas-logon-app-65b6795f7f-6b94v -- cat /etc/hosts

 

and for pods with multiple containers:

 

$ kubectl exec --stdin --tty -n <Viya Namespace> sas-cas-server-default-controller --container cas -- cat /etc/hosts

 

If you find that CAS doesn't show the new host aliases in /etc/hosts, remember to direct Kubernetes to restart CAS by deleting its pods.

 

Now those two commands are good for spot-checking, but for a fully comprehensive validation, you'll need to get the full list of pods for your deployment and then loop through them all to test.

 

Like this:

 

#!/bin/bash

#
# Loop through the pods of your Viya namespace and test for the "10.96" string
# of the new host alias IP address. Modify per your site.
#

NS="<Viya Namespace>"

# list the pod names, skipping the header line
PODS=$(kubectl get pods -n "$NS" --no-headers | awk '{ print $1 }')

for p in $PODS
do
  # quietly grep each pod's /etc/hosts for the new IP prefix
  kubectl exec -n "$NS" "${p}" -- /bin/grep -q 10.96 /etc/hosts \
    && echo " PASS: ${p}" \
    || echo -e "\n-- FAIL: ${p}\n"
done

 

After applying the changes from site.yaml to the cluster, it might take several minutes before all pods will respond with a "PASS" when running this validation script. Re-run it to see if the number of PASSes increases with each run.

Wrapping up

I hope you found this exercise interesting. For me, it helped clarify the various layers of a SAS Viya configuration and which utilities are responsible for getting it done. We've shared this topic with the Viya dev teams for their input... and they acknowledge they might be able to provide a simpler approach in the future. In the meantime, when you can't make changes to your site's DNS, use the approach described here to make the desired changes to the /etc/hosts files when needed.

H/T

Special thanks to my GEL colleague Erwan Granger, Advisory Technical Architect, as well as David Page, Distinguished Software Developer in DevOps Engineering R&D, for their collaboration and review of this post. Any mistakes are mine, not theirs.

Comments

Hi,

 

Good Day. Wanted to ask, what if I have a sas-microanalytic pod (MAS)? 

What would the kind be?

 

Thanks

 

Update:

I tried using kind: Deployment, and it seems to have worked successfully.

Just to add: if the server we're connecting to (e.g., an Oracle database) is load balanced, will multiple IP addresses with the same hostname work in /etc/hosts of the pods? Like:

123.123.123.123 A

123.123.123.124 A

123.123.123.125 A

 

Or this setup will require a DNS?

 

Thanks

Alko13,

 

Sorry, I'm not familiar with Oracle load balancing concepts specifically. But as far as the /etc/hosts file goes, you cannot have multiple IP addresses that reference the same hostname alias. In your example above, the hostname "A" is associated with 3 IP addresses, and most resolvers will simply use the first match. That's unlikely to work like you might want.

 

You can, on the other hand, have multiple hostname aliases for a single IP address (shown in my examples above).

 

I recommend conferring with your Oracle administrator to determine the route(s) they expect and support for reaching the Oracle database. I doubt you'll be expected to modify /etc/hosts in the SAS Viya pods, and they should be able to give you good instructions about network routing to reach the Oracle load balancer.
