
Understanding Kubernetes concepts to uninstall SAS Viya

Started ‎12-20-2022 by
Modified ‎12-20-2022 by

The steps to uninstall SAS Viya are clearly documented in the SAS Viya Administration > Deployment Guide. That should definitely be your first stop any time you want to remove SAS Viya from a Kubernetes cluster. The steps are easy to follow and get the job done with minimal fuss. But if you're like me, chances are you either missed a step or perhaps inserted an unnecessary one. If that's happened to you, then you might find that Kubernetes hangs, unable to finish deleting the resource(s) you've asked it to remove. So let's take a look at what can go wrong as well as how to prevent and/or correct those problems.

 

We'll begin with explaining a little bit about how Kubernetes deletes objects and then highlight areas where this intersects with uninstalling SAS Viya.

 

Kubernetes Finalizers

 

When we direct Kubernetes to delete a simple object, then we expect to see Kubernetes delete the object as intended, followed by updating its internal etcd database to reflect the change.

 

But that's not always the case.

 

There are some circumstances where the software requires an opportunity to perform some pre-delete actions before the resource is actually deleted. That's where Kubernetes Finalizers come into play. When a finalizer is defined for an object, that signals Kubernetes that a controller or Kubernetes Operator must first get a chance to clean up any dependent objects. For example, if deleting a PVC, then the Kubernetes controller will first ensure the underlying storage device is released before actually removing the record of the PVC's existence from etcd.

 

So when we attempt to delete a resource with a finalizer defined, Kubernetes sets the `deletionTimestamp` on the object. This does a couple of things: 1) it makes the resource effectively read-only (from then on, about the only change allowed is removing finalizer entries) and 2) it lets the object's controller notice the change so it can take the pre-delete action.

 

Once the object's controller has finished its pre-delete activity, then it updates the resource to remove the finalizer references. This then allows Kubernetes to proceed with actually deleting the object.
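To watch this lifecycle on a real object, it can help to print the two metadata fields involved. What follows is only a minimal sketch and not something taken from the SAS documentation: the function name `show_deletion_state` is made up for illustration, and it assumes `kubectl` is on your PATH and pointed at the right cluster.

```shell
# Hypothetical helper (not part of SAS Viya or kubectl itself):
# print an object's deletionTimestamp and its finalizers, tab-separated.
# Usage: show_deletion_state <kind> <name> <namespace>
show_deletion_state() {
  kubectl get "$1" "$2" -n "$3" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\t"}{.metadata.finalizers}{"\n"}'
}

# Example call (placeholder names, not run here):
#   show_deletion_state pvc my-claim my-namespace
```

An empty first field means nobody has asked for the object to be deleted yet. A timestamp alongside a non-empty finalizers list means Kubernetes is waiting for a controller to finish and remove its finalizer.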

 

Why might finalizers fail?

 

There are many reasons why finalizers might fail. But to keep this post somewhat pithy, let’s look at two simple scenarios.

 

The first scenario is that the finalizer's controller is unable to complete its prescribed task. For example, perhaps a cloud's load balancer isn't deleting cleanly, which leaves the finalizer stuck waiting. The second scenario is that no controller associated with the finalizer is present to actually pre-delete anything and subsequently remove the finalizer reference from the object. There are a couple of reasons for this. One is that the controller process might not be running when it's expected to be. Another is that no controller was ever defined to act on that finalizer (a variation on the first reason). The latter can be used deliberately when you want to manage pre-deletion activities yourself.

 

A finalizer tag on a resource is really just an arbitrary string - Kubernetes doesn't usually understand the third-party values stored there. When Kubernetes sees a finalizer reference on an object to be deleted, all it does is mark the object as ready for deletion and then it awaits removal of the finalizer reference. It won't delete the object until that finalizer reference is removed, waiting forever in some cases.

 

It's a very simple process that can sometimes lead to complicated outcomes.

 

Finalizers used by SAS Viya software

 

Primarily there's one piece of software included with SAS Viya where finalizers play an important role: internal Crunchy PostgreSQL version 5 (updated from version 4 with SAS Viya stable 2022.10).

 

In the YAML definition below, we can see the finalizer associated with the `postgres-operator`:

 

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  annotations: { snipped } 
  creationTimestamp: '2022-12-15T13:47:17Z'
  finalizers:
    - postgres-operator.crunchydata.com/finalizer
  generation: 1
  labels:
    sas.com/admin: namespace
    sas.com/deployment: sas-viya

In order to successfully uninstall SAS Viya, we need to allow the `postgres-operator` the opportunity to properly perform its pre-delete activities. If we don't, then we will find it difficult to complete the removal of SAS Viya as intended.

 

Part of the challenge with Crunchy in particular is that it uses cluster-wide resources like Custom Resource Definitions (CRD) to instantiate namespace resources. Due to how the scoping works, executing the documented uninstall steps out of order can lead to a stalled deletion. Typically, this is the result of a race condition where the `postgres-operator` that services the finalizer isn't running (or might not even exist anymore), so the objects it's supposed to clean up never have their finalizer removed and Kubernetes gets stuck in a wait state.

 

There are other software components associated with SAS Viya that also define finalizers, but most of those rely on Kubernetes built-in functionality to get the job done and so don't present much problem.

 

The happy path to uninstall SAS Viya

 

The first rule of uninstalling SAS Viya is to follow the steps documented in the SAS Viya Administration > Deployment Guide.

 

The second rule of uninstalling SAS Viya is to follow the steps documented in the SAS Viya Administration > Deployment Guide.

 

For brevity, I'll hit on the key areas to make the point.

 

# your namespace here
NS=my-sas-viya

# 1. delete the Postgres cluster from the namespace
kubectl -n ${NS} delete postgresclusters --selector="sas.com/deployment=sas-viya"

# 2. delete the namespace (and most SAS Viya assets)
kubectl delete ns ${NS}

# --- (optional) cluster-wide assets ---

# 3. show the CRD provided by SAS Viya
kubectl get crd --selector "sas.com/admin=cluster-wide"
kubectl get crd --selector "sas.com/admin=cluster-api"

# 4. delete the CRD provided by SAS Viya
kubectl delete crd --selector "sas.com/admin=cluster-wide"
kubectl delete crd --selector "sas.com/admin=cluster-api"

# 5. show the clusterroles provided by SAS Viya
kubectl get clusterrole --selector "sas.com/admin=cluster-wide"

# 6. delete the clusterroles provided by SAS Viya
kubectl delete clusterrole --selector "sas.com/admin=cluster-wide"

 

Steps № 1 and 2 will remove the specified instance of SAS Viya software from the Kubernetes cluster (i.e., postgrescluster and namespace).

 

Steps № 3 - 6 are optional. They reference cluster-wide elements that might still be in use by other SAS Viya deployments in your cluster. So only remove them if they're really no longer needed.

 

Next we'll look at the (mis-)steps to avoid…

 

Do not stop the SAS Viya services first

 

To be clear, the documentation doesn't tell you to stop (i.e., scale down) the SAS Viya services prior to uninstalling. This is one of those steps that an over-zealous SAS administrator (like this author) might add to the process - but it's a bad thing.

 

Remember the `postgres-operator` needs to be running so it can perform its finalizer duties. If it's not, then attempting to uninstall SAS Viya will ultimately lead to a stalled execution of the `kubectl delete` command for the affected resources.

 

If you have already scaled down the SAS Viya services and haven't yet begun the uninstall process, then scale those services back up so they're running as expected.
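A quick pre-flight check can confirm the operator is actually up before you start. This is a hedged sketch only: `operator_ready` is a name I made up, and the Deployment name you pass in is an assumption you'll need to look up in your own namespace (for example with `kubectl -n ${NS} get deployments`), since this post doesn't state it.

```shell
# Hypothetical pre-flight check: succeed only if the named Deployment
# reports at least one ready replica.
# Usage: operator_ready <namespace> <deployment-name>
operator_ready() {
  ready=$(kubectl -n "$1" get deployment "$2" \
    -o jsonpath='{.status.readyReplicas}' 2>/dev/null)
  # readyReplicas is omitted (so the output is empty) when nothing is ready
  [ "${ready:-0}" -gt 0 ]
}

# Example call (the deployment name is a placeholder, not run here):
#   operator_ready my-sas-viya <postgres-operator-deployment> \
#     || echo "postgres-operator is not running; scale it back up first"
```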

 

Do not start uninstalling SAS Viya by deleting its namespace

 

Up through stable-2022.09, you were supposed to begin by deleting the SAS Viya namespace. But not anymore.

 

Now we must start by deleting the `postgresclusters` associated with SAS Viya (step № 1 of the Happy Path above). This will ensure that the finalizer process is properly executed to remove custom resources defined and used by postgres in the namespace.
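If you script the uninstall, one way to enforce that ordering is to refuse to touch the namespace until the `postgresclusters` are confirmed gone. The wrapper below is a sketch under my own assumptions: the function name `uninstall_viya_namespace` and the 300-second timeout are arbitrary choices, not anything from the Deployment Guide. Note that `kubectl delete` already waits for finalizers by default, so the explicit `kubectl wait` is a belt-and-braces check.

```shell
# Hypothetical wrapper around steps 1 and 2 of the happy path.
# Usage: uninstall_viya_namespace <namespace>
uninstall_viya_namespace() {
  ns="$1"
  sel="sas.com/deployment=sas-viya"

  # Step 1: request deletion; the operator's finalizer work happens now
  kubectl -n "$ns" delete postgresclusters --selector="$sel" || return 1

  # Confirm the objects have really disappeared (timeout value is arbitrary)
  kubectl -n "$ns" wait --for=delete postgresclusters --selector="$sel" --timeout=300s || return 1

  # Step 2: only now remove the namespace
  kubectl delete ns "$ns"
}
```

If either of the first two commands fails or times out, the function returns early and leaves the namespace alone, which keeps the `postgres-operator` around to finish its work.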

 

If we skip that step and jump ahead to deleting the namespace first, then Kubernetes will delete the `postgres-operator` before it gets its chance to complete the pre-delete activities it's supposed to perform. This will leave the SAS Viya namespace stuck in a Terminating state forever - or until we manually remove the finalizer reference from the namespace's definition (something Kubernetes normally does on its own once everything in the namespace, including the resources the `postgres-operator` cleans up, is gone).

 

Furthermore, once you've initiated and stalled the namespace deletion, then you cannot fix it by attempting to retroactively delete `postgresclusters`. As we say in the South, "That ship's done sailed." The Crunchy operator that would've handled those protected resources is already gone. So, now it's your responsibility to ensure those pre-delete activities are done and that the finalizer references are removed from the affected resources.

 

You broke the uninstall, now fix it

 

So, like me, you've inadvertently gotten hung up with finalizer-bound resources preventing you from uninstalling SAS Viya. Chances are, this might even prevent you from re-installing SAS Viya. Whatever the case, you need to get things working again.

 

The first step is to ensure that any finalizer-bound resources are properly cleaned up. This will prove tricky if you don't know what those steps are. You might try cracking open the code of the operator responsible to suss out those actions and then execute them.

 

Sidebar: You might think that you don't care because maybe this environment was only used for testing/demo purposes. Or maybe this deployment is part of an automated pipeline to temporarily spin up an environment, get the desired results, and then blow it away. But keep in mind that deleting something from Kubernetes' awareness doesn't guarantee the physical items are really gone. Kubernetes might not be aware of the pod any more, but the process could still be running in a VM on an instance in the cloud - continuing to charge money to your organization. There are similar considerations for device storage, load balancers, and so on.

 

The next step after properly cleaning up those resources is to patch out the finalizer reference on those objects so that Kubernetes can proceed with its part of the deletion process.

 

Sidebar: Removing the finalizer references first is an effective way to let Kubernetes complete deletion of the affected objects. This is sometimes referred to as the "nuclear option" because it arbitrarily removes the finalizer reference without performing the due diligence it implies.
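For the `postgresclusters` objects themselves, patching out the finalizer could look something like the sketch below. Treat it as an illustration rather than a documented procedure: the function name `clear_pgcluster_finalizers` is invented here, and the code does none of the cleanup the operator would have done, so it's only appropriate once you're satisfied that work has been handled.

```shell
# Hypothetical helper: empty metadata.finalizers on every PostgresCluster
# in a namespace. This is the "nuclear option" and skips all pre-delete work.
# Usage: clear_pgcluster_finalizers <namespace>
clear_pgcluster_finalizers() {
  ns="$1"
  # "-o name" prints one <resource>/<name> per line, which patch accepts as-is
  kubectl -n "$ns" get postgresclusters -o name | while read -r obj; do
    kubectl -n "$ns" patch "$obj" --type=merge \
      -p '{"metadata":{"finalizers":[]}}'
  done
}

# Example call (not run here):
#   clear_pgcluster_finalizers my-sas-viya
```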

 

Force deletion of the SAS Viya namespace

 

Fortunately, the fix for dealing with failed execution of the `postgres-operator.crunchydata.com/finalizer` is to remove the finalizer reference from the namespace. The code below essentially implements the "nuclear option":

 

# your namespace here
NS=my-sas-viya

# get the namespace definition
kubectl get namespace ${NS} -o json >/tmp/${NS}.json

# remove the "kubernetes" line from the finalizers spec
sed -i '/kubernetes/d' /tmp/${NS}.json

# put the namespace definition without finalizer into effect
kubectl replace --raw "/api/v1/namespaces/${NS}/finalize" -f /tmp/${NS}.json

# clean up
rm /tmp/${NS}.json

 

Simply put, this script deletes every line containing the word "kubernetes" from the namespace definition's JSON. In a definition similar to the one below, that matches both the `kubernetes.io/metadata.name` label and the `kubernetes` finalizer entry:

 

"apiVersion": "v1",
"kind": "Namespace",
"metadata": {
    "creationTimestamp": "2022-12-13T21:55:08Z",
    "labels": {
        "kubernetes.io/metadata.name": "my-sas-viya"
    },
    "name": "my-sas-viya",
    "resourceVersion": "3944",
    "uid": "4d15d2f4-95df-4d41-b12b-ce951d3ff7a2"
},
"spec": {
    "finalizers": [
        "kubernetes"
    ]
},
"status": {
    "phase": "Active"
}
... ... ...

 

The script removes the "kubernetes" value from all of the finalizer specs it finds. With no finalizer defined, then Kubernetes interprets the change to mean it can now delete the resource. In this case this allows Kubernetes to successfully terminate the SAS Viya namespace.
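Because `sed` works line by line, it also drops the `kubernetes.io/metadata.name` label, and in a definition with more labels or annotations it could leave a dangling comma behind. If `jq` happens to be installed (an assumption on my part, it isn't used in the original script), a more targeted variation is to empty only `spec.finalizers`. The function name `strip_ns_finalizers` is made up for this sketch.

```shell
# Hypothetical filter: read a namespace definition (JSON) on stdin and
# write it back out with spec.finalizers emptied. Everything else,
# including labels that mention "kubernetes", is left untouched.
strip_ns_finalizers() {
  jq '.spec.finalizers = []'
}

# Example (not run here), standing in for the get + sed steps:
#   kubectl get namespace "${NS}" -o json | strip_ns_finalizers > "/tmp/${NS}.json"
```

The resulting file is then submitted to the namespace's `finalize` endpoint with the same `kubectl replace --raw` command as before.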

 

If you're running a `kubectl delete ns` command and it was hung up, then you should see it complete after running the script above (probably in a different terminal window).

 

Bonus finalizer!

 

But wait, there's more! Remember earlier where I mentioned that Crunchy defines cluster-wide resources? In particular, there's the `postgresclusters.postgres-operator.crunchydata.com` CRD. In normal operation, it doesn't have a finalizer defined for it.

 

In a normally running SAS Viya environment, we can ask for the CRDs and their associated finalizers:

 

$ kubectl get crd -o custom-columns=Kind:.kind,Name:.metadata.name,Finalizers:.metadata.finalizers

 

And see:

 

Kind                       Name                                                  Finalizers
CustomResourceDefinition   postgresclusters.postgres-operator.crunchydata.com    <none>
{additional results not shown}

 

But now consider this from the perspective of a failed uninstall that we're trying to recover from. We've essentially directed Kubernetes to remove the custom resources (as part of the namespace delete) which are described by this CRD. So Kubernetes applies its own finalizer to the CRD to ensure it's only removed after its dependent resources have been deleted first.

 

Now it's got a finalizer that wasn't there before:

 

Kind                       Name                                                  Finalizers
CustomResourceDefinition   postgresclusters.postgres-operator.crunchydata.com    [customresourcecleanup.apiextensions.k8s.io]
{additional results not shown}

While an interesting result, now we're stuck again. Deleting the CRD cannot finish because there's a finalizer associated with it - and that finalizer isn't able to do its job.

 

Since we already know the custom resources in the SAS Viya namespace have been deleted at this point, then we know it's okay to remove the finalizer reference from the CRD:

 

$ kubectl patch crd/postgresclusters.postgres-operator.crunchydata.com -p '{"metadata":{"finalizers":[]}}' --type=merge

 

With that removed, then Kubernetes can delete the CRD, too. Similar to the namespace deletion above, if you're running a `kubectl delete crd` command and it was hung up, then you should see it complete after running the patch above (probably in a different terminal window).

 

One more thing

 

If you don't get that CRD deleted because it remains stuck in a Terminating state, and then you attempt to re-install SAS Viya to the Kubernetes cluster, you might hit this error message:

 

Error from server (MethodNotAllowed): error when creating "/work/deploy/manifest.yaml": create not allowed while custom resource definition is terminating

Now that you've read this post, hopefully the error message makes sense. Trying to reinstall Viya while some of the old components are stuck in a Terminating state won't be successful. When you see that, removing the associated finalizer reference - ideally after ensuring the pre-delete tasks that it implies have been completed - should allow the resource to delete from Kubernetes. Then re-installing SAS Viya will proceed as planned.
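Before re-installing, you could check up front whether any CRD is still stuck. The snippet below is a rough sketch: `terminating_crds` is a hypothetical name, and it relies on `kubectl` printing `<none>` in custom columns when a field such as `deletionTimestamp` is absent.

```shell
# Hypothetical check: print the names of CRDs that carry a deletionTimestamp,
# i.e. those currently in a Terminating state.
terminating_crds() {
  kubectl get crd --no-headers \
    -o custom-columns=NAME:.metadata.name,DELETING:.metadata.deletionTimestamp |
    awk '$2 != "<none>" { print $1 }'
}

# Example call (not run here):
#   terminating_crds
```

No output means no CRD is waiting on a finalizer, so the "create not allowed while custom resource definition is terminating" error shouldn't come from that source.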

 

Wrapping up

 

The use of Kubernetes Finalizers is intended to help protect an environment from the accidental deletion of resources to ensure smooth operation and clean up. As we've seen, the implementation of finalizers involves trade-offs in terms of flexibility and simplicity. Understanding where these considerations come into play when uninstalling SAS Viya should help if deletion of resources cannot complete normally.

 

I'd also like to thank my colleague Christian Provenzano for his technical insights on this subject.

 

Find more articles from SAS Global Enablement and Learning here.

