
Retaining your SAS Viya Backup

Started ‎11-18-2022 by
Modified ‎11-18-2022 by

Backups are critical in Enterprise Software. In SAS Viya a good backup is key to protecting your environment and supporting the restoration of lost content or a complete environment in the event of a disaster. In a previous post, I covered how to perform a Backup and Restore in Viya 4. You cannot do a restore without a backup package. In this post, I will look at how you ensure your backup package is available when you need it.

 

Review of Backup and Restore

 

First, let’s review the key aspects of Viya 4 Backup and Restore (for more details, see the previous post). In Viya 4:

  • Backup and Restore are implemented using Kubernetes Jobs
  • A Backup includes:
    • Content stored in the Infrastructure Data Server​
    • Configuration stored in the SAS Configuration Server​
    • CAS permstore ​
    • CAS persistent volumes
  • Backups by default run weekly on Sunday at 1:00 am (UTC), but can also be performed ad hoc

The Backup Package

 

The result of a backup is a backup package. Backup packages are distinguished by a unique Backup ID and are written to and read from two Kubernetes Persistent Volumes, claimed by the PVCs sas-cas-backup-data and sas-common-backup-data.

 

sas-common-backup-data has a sub-directory named for the unique ID of the backup (Backup ID), and under that directory are sub-directories for:

 

  • consul: the backup of the Configuration Server​
  • postgres: the backup of the Infrastructure Data Server​

 

gn_viya4_backup02.png


 

sas-cas-backup-data has a sub-directory named for the unique ID of the backup (Backup ID), and under that directory are sub-directories for:

 

  • cas contains a sub-directory for each CAS server, and that sub-directory contains the CAS permstore for that server. The permstore contains the caslib definitions, authorization, etc.​
  • filesystem contains a sub-directory for each CAS server; within that sub-directory is a caslibs directory that contains the data from the default caslibs, e.g. public, samples, systemdata, vamodels, etc.

 

gn_viya4_backup03.png

 
Storage in Kubernetes

 

As we have seen, the backup is stored on Persistent Volumes in the Kubernetes cluster. Let’s review Persistent Volumes (PV), Persistent Volume Claims (PVC), and Storage Classes.

 

A Persistent Volume is a piece of storage in a Kubernetes cluster. Ignoring a lot of details :), Persistent Volumes are (mostly) dynamically provisioned in a cluster when a Pod makes a Persistent Volume Claim. A PVC is a request for storage of a specific size, access mode, and storage class. Kubernetes looks for a PV that meets the criteria defined in the PVC and, if there is one, binds the claim to that PV. If no matching PV exists, a StorageClass allows a new PV to be provisioned dynamically to satisfy the claim.

 

An important attribute of a PV is its reclaimPolicy. The reclaimPolicy determines what happens to a PV, and the data on it, when there is no longer any “claim” on it. The two values in common use are:

 

  • Delete: deletes the PV and the underlying physical volume.
  • Retain: keeps the PV and the physical volume, allowing manual reclamation of the data.

 

PVs that are dynamically created by a StorageClass have the reclaim policy specified in the reclaimPolicy field of the class. If no reclaimPolicy is specified when a StorageClass object is created, it defaults to Delete. The policy is applied to the Persistent Volumes provisioned from the class, which inherit it when they are created; PVCs themselves have no reclaim policy. Typically the reclaim policy is Delete.
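
To see which reclaim policy each storage class in a cluster will pass on to its volumes, something like the following could be used. It is a minimal sketch: the function name show_storageclass_policies is made up for illustration, and it only reads the standard metadata.name, provisioner, and reclaimPolicy fields of each StorageClass.

```shell
# Hypothetical helper (the name is an assumption, not part of SAS Viya or kubectl):
# list every StorageClass with its provisioner and reclaimPolicy.
show_storageclass_policies() {
  kubectl get storageclass \
    -o custom-columns='NAME:.metadata.name,PROVISIONER:.provisioner,RECLAIMPOLICY:.reclaimPolicy'
}
```

Calling show_storageclass_policies prints one row per storage class, which makes it easy to spot classes whose dynamically provisioned volumes will be deleted along with their claims.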

 

The SAS Viya documentation says here:

 

Many storage classes have a reclaimPolicy that is set to delete by default. If you delete a namespace that includes such storage classes, the PVCs in that namespace are deleted. If the reclaimPolicy is Delete, the corresponding persistent volumes (PVs) are also deleted, resulting in data loss.

 

Specifically for the Backup Volumes, it is recommended to change the reclaimPolicy to RETAIN. So how do we do that?

 

Change the reclaimPolicy of the Backup Persistent Volumes

 

There are two places we can change the reclaimPolicy:

  • Storage Class
  • Persistent Volumes​

 

In the Storage Class definition, you can set the reclaimPolicy to Retain instead of Delete. When the storage class reclaimPolicy is Retain, every Persistent Volume dynamically provisioned from that class inherits the Retain policy. In our case we don't want that: we want Kubernetes to clean up the majority of our volumes automatically, just not the two backup volumes. Let’s look at how to patch just the backup Persistent Volumes so that their reclaim policy is Retain.

 

Firstly, let’s view the Backup Persistent Volumes.

 

kubectl get pvc -l 'sas.com/backup-role=storage'

 

gn_4_reclaimpolicy_blog_01.png

 

The output shows that for the two backup PVCs, two volumes were provisioned by the Storage Class nfs-client. Both volumes are RWX (ReadWriteMany: the volume can be mounted as read-write by many nodes). Let’s look at the details of the CAS backup volume.

 

kubectl describe pv $(kubectl get pvc sas-cas-backup-data -o jsonpath='{.spec.volumeName}')

 

gn_5_reclaimpolicy_blog_02-1.png
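
As a more compact alternative to reading the full describe output, the check can be sketched as a small loop that prints each backup PVC, the PV bound to it, and that PV's reclaim policy. The function name show_backup_pv_policies is hypothetical, and the sketch assumes the current kubectl namespace is the Viya namespace and that the sas.com/backup-role=storage label selects only the backup PVCs, as in the command above.

```shell
# Hypothetical helper: for every backup PVC, print the PVC name, the PV bound
# to it, and that PV's reclaim policy (tab-separated).
# Assumes the current kubectl context/namespace is the SAS Viya namespace.
show_backup_pv_policies() {
  for pvc in $(kubectl get pvc -l 'sas.com/backup-role=storage' \
                 -o jsonpath='{.items[*].metadata.name}'); do
    # Name of the PV bound to this claim
    pv=$(kubectl get pvc "$pvc" -o jsonpath='{.spec.volumeName}')
    # Reclaim policy currently set on that PV
    policy=$(kubectl get pv "$pv" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}')
    printf '%s\t%s\t%s\n' "$pvc" "$pv" "$policy"
  done
}
```

Running show_backup_pv_policies before and after patching gives a quick before/after comparison of the policy on each backup volume.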

 

Notice that the reclaim policy, inherited from the storage class, is Delete. For the backup data, a reclaim policy of Retain is more appropriate. Because the two Persistent Volumes were dynamically provisioned from the Storage Class, we must patch both backup volumes, setting spec.persistentVolumeReclaimPolicy to Retain. Fortunately, this is easy to do. The commands below first get the names of the volumes that are bound to the PVCs and then patch those volumes to change the reclaim policy.

 

# get the names of the volumes that are bound to the backup PVCs
casbackvolname=$(kubectl get pvc sas-cas-backup-data -o jsonpath='{.spec.volumeName}')
commonbackvolname=$(kubectl get pvc sas-common-backup-data -o jsonpath='{.spec.volumeName}')

# patch each volume to change spec.persistentVolumeReclaimPolicy to Retain
kubectl patch pv "${casbackvolname}" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
kubectl patch pv "${commonbackvolname}" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

 

gn_6_reclaimpolicy_blog_03.png
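
The same patch can also be sketched as a loop driven by the backup label, so the PVC names do not need to be typed out. This is an illustration rather than a documented SAS procedure: the function name retain_backup_pvs is hypothetical, and the sketch assumes that the sas.com/backup-role=storage label selects exactly the backup PVCs in the current namespace.

```shell
# Hypothetical helper: set the reclaim policy to Retain on every PV that is
# bound to a PVC carrying the backup label.
# Assumes the current kubectl context/namespace is the SAS Viya namespace.
retain_backup_pvs() {
  for pv in $(kubectl get pvc -l 'sas.com/backup-role=storage' \
                -o jsonpath='{.items[*].spec.volumeName}'); do
    kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  done
}
```

Because the patch only sets a field to a fixed value, calling retain_backup_pvs again on volumes that are already set to Retain should be harmless.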

 

If we view the Persistent Volumes again, we can see that the reclaim policy is now Retain for the two backup PVs. This means that even after an event as drastic as deletion of the Viya namespace, the backup package can be recovered.

 

kubectl get pv | grep Retain

 

gn_7_reclaimpolicy_blog_04-1024x54.png

 

NOTE: Because Kubernetes will no longer automatically clean up these volumes, the Viya administrator should make sure that any data no longer needed on the volumes is deleted.
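
To help with that housekeeping, one possible way to find retained volumes whose claim has gone away is to list PVs that are in the Released phase and have a Retain policy. This is a sketch under assumptions: the function name list_released_retained_pvs is hypothetical, and the list covers every such PV in the cluster, not only the Viya backup volumes, so each entry should be reviewed before anything is deleted.

```shell
# Hypothetical helper: print the names of PVs that are Released (their claim
# was deleted) but still exist because their reclaim policy is Retain.
list_released_retained_pvs() {
  kubectl get pv \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.spec.persistentVolumeReclaimPolicy}{"\n"}{end}' \
    | awk -F'\t' '$2 == "Released" && $3 == "Retain" { print $1 }'
}
```

The function only reports candidates; removing a volume and its underlying data remains a deliberate manual step for the administrator.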

 

Another approach to ensuring that your Backup Package is preserved is to archive the backup from the persistent volumes to storage outside the cluster. In this option, the backup package is copied to storage that is not inside the cluster and can be retrieved from that location if necessary. I will take a look at how you can do this in a subsequent post.

 

More Information

 

In this post, I showed you an easy way to make sure that your Viya Backup is available when you need it. For additional information on Viya Backup and Restore check out these resources:

 

 

Find more articles from SAS Global Enablement and Learning here.

