SAS Viya stable 2023.10 introduces a simplified approach to restoring a Viya backup. In prior releases, an administrator was required to build and apply temporary manifests to initiate a restore. In 2023.10 (and forward) this is no longer the case. In this post, I will review the new and improved functionality.
In a previous post, I provided an overview of SAS Viya 4 Backup and Restore. To summarize, a Viya backup includes the SAS Infrastructure Data Server, the SAS Configuration Server, and CAS server data and configuration.
Viya backup and restore are implemented using native Kubernetes functionality: backups and restores run as Kubernetes Jobs and CronJobs. The restore process has two high-level steps: running the restore job, which restores the SAS Infrastructure Data Server and the SAS Configuration Server, and restarting the CAS server in RESTORE mode.
So what has changed? In 2023.10 the restore process performs the same two steps; however, the method of initiating a restore has improved. Before 2023.10, the restore process involved building and applying temporary Kubernetes manifests to perform the two steps: running the restore job and restarting CAS in RESTORE mode. After the restore was complete, the manifests that manage the deployment had to be reset to their original state and the temporary manifests discarded. The process was not particularly easy for the Viya administrator. The new process does not require the creation of temporary manifests. Let's see how it works now.
Each individual backup is identified using a timestamp value called the backup ID. In the command below, we retrieve the backup ID of the backup to restore from the ad-hoc backup job that created the backup.
backupid=$(yq4 eval '(.metadata.labels."sas.com/sas-backup-id")' <(kubectl get job sas-scheduled-backup-job-adhoc-001 -o yaml))
echo ${backupid}
Output:
2023-11-09T15_32_28_628_0700
When selecting a backup to restore, you should always check the status.json file in the sas-common-backup-data PVC directory to be sure that the backup you are restoring completed successfully. The status.json file contains detailed information about the backup. For a successful backup, the file should contain the value sas.com/sas-backup-job-status: Completed.
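As a quick sanity check, a pattern like the following can be used once status.json has been copied out of the PVC. The file path and the exact JSON layout here are assumptions for illustration only; verify them against the actual status.json for your backup ID.

```shell
# Create a sample status.json for illustration only; in practice, copy the
# real file for your backup ID out of the sas-common-backup-data PVC.
cat > /tmp/status.json <<'EOF'
{"sas.com/sas-backup-job-status": "Completed"}
EOF

# Check for the success marker (key name from the documentation; the
# surrounding JSON structure is an assumption).
if grep -q '"sas.com/sas-backup-job-status": "Completed"' /tmp/status.json; then
  echo "Backup completed - safe to restore"
else
  echo "Backup did not complete - choose another backup ID"
fi
```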
The sas-restore-job-parameters configMap is how we pass settings to the restore process. The configMap is referenced by the restore job and by the CAS server when it starts. The two parameters to set for a restore are SAS_BACKUP_ID, which identifies the backup to restore, and SAS_DEPLOYMENT_START_MODE, which must be set to RESTORE.
The patch command patches the configMap updating the values of SAS_BACKUP_ID and SAS_DEPLOYMENT_START_MODE:
restore_config_map=$(kubectl describe cronjob sas-restore-job | grep -i sas-restore-job-parameters | awk '{print $1}'|head -n 1)
echo The current restore Config Map is: $restore_config_map
kubectl patch cm $restore_config_map --type json -p '[ {"op": "replace", "path": "/data/SAS_BACKUP_ID", "value":"'${backupid}'"}, {"op": "replace", "path": "/data/SAS_DEPLOYMENT_START_MODE", "value":"RESTORE" }]'
Output:
The current restore Config Map is: sas-restore-job-parameters-bm48bd82bg
configmap/sas-restore-job-parameters-bm48bd82bg patched
Using the following command we can view the updated configMap and make sure that the SAS_BACKUP_ID and SAS_DEPLOYMENT_START_MODE parameters are correctly set.
kubectl describe cm $restore_config_map
With the parameters set correctly, we can now start the restore job from the restore CronJob. This process restores the SAS Infrastructure Data Server and the SAS Configuration Server. In addition, it stops the CAS server, which is a prerequisite for the second step, restoring the CAS server.
kubectl create job --from=cronjob/sas-restore-job sas-restore-job
Output:
job.batch/sas-restore-job created
You can view the job log as it runs with this command, using the -f parameter to stream the log to the screen.
kubectl logs -l "job-name=sas-restore-job" -f -c sas-restore-job | gel_log
Note: in the previous command we pipe to a custom function (gel_log) to reformat the log from JSON to a more human-readable format. The function is shown below for your information.
gel_log () {
  jq -R -r '. as $line | try (fromjson | "\(.level | ascii_upcase) \(.timeStamp) [\(.source)]- \(.message)") catch $line'
}
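To see what gel_log does, you can feed it a fabricated JSON log line locally. The function is redefined here so the snippet is self-contained, and the log field values are sample data, not real restore-job output:

```shell
# gel_log reformats each JSON log line into "LEVEL TIMESTAMP [SOURCE]- MESSAGE";
# non-JSON lines pass through unchanged thanks to jq's try/catch.
gel_log () {
  jq -R -r '. as $line | try (fromjson | "\(.level | ascii_upcase) \(.timeStamp) [\(.source)]- \(.message)") catch $line'
}

# Fabricated sample log line (field values are made up for illustration)
echo '{"level":"info","timeStamp":"2023-11-16T19:48:28.000Z","source":"sas-restore-job","message":"restore job completed successfully"}' | gel_log
# prints: INFO 2023-11-16T19:48:28.000Z [sas-restore-job]- restore job completed successfully
```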
You can check the status of a restore job with this command. The status should eventually change from "Running" to "Completed".
kubectl get jobs -l "sas.com/backup-job-type=restore" -L "sas.com/sas-backup-id,sas.com/backup-job-type,sas.com/sas-restore-status"
Output:
NAME              COMPLETIONS   DURATION   AGE   SAS-BACKUP-ID                  BACKUP-JOB-TYPE   SAS-RESTORE-STATUS
sas-restore-job   0/1           42s        42s   2023-11-16T19_48_28_628_0700   restore           Running
To make sure the job has run successfully check the log for the message "restore job completed successfully."
kubectl logs -l "job-name=sas-restore-job" -c sas-restore-job --tail 1000 | gel_log | grep "restore job completed successfully" -B 3 -A 1
The restore job performs a rolling restart of many of the SAS Viya pods. Before moving on to the next step, we should check that two of the key pods (SAS Logon and SAS Configuration) are up and running.
kubectl get pods -l app=sas-logon-app
kubectl get pods -l app=sas-configuration
Output:
[cloud-user@pdcesx11133 from35]$ kubectl get pods -l app=sas-logon-app
NAME                             READY   STATUS    RESTARTS   AGE
sas-logon-app-5dcb9df44d-hb8gd   1/1     Running   0          2m26s
[cloud-user@pdcesx11133 from35]$ kubectl get pods -l app=sas-configuration
NAME                                 READY   STATUS    RESTARTS   AGE
sas-configuration-68b558b8b9-g7dg4   1/1     Running   0          2m46s
With the restore job completed, the second step restores the CAS server. The restore job has stopped all CAS servers in the environment. To restore CAS, the CAS server is started in RESTORE mode, and data and configuration are restored during server startup. First, let's check that CAS is not running.
kubectl get pods --selector="casoperator.sas.com/server==default" -n gelcorp
Expected Output:
No resources found in gelcorp namespace.
To replace the old manifest approach, two new scripts are now used to initiate the CAS restore: sas-backup-pv-copy-cleanup.sh and scale-up-cas.sh. The scripts are delivered with the deployment assets in the directory sas-bases/examples/restore/scripts. To run the scripts, we need to make them executable.
chmod +x ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/*.sh
The restoration of the data to the two CAS file PVCs requires a clean volume. Run the sas-backup-pv-copy-cleanup.sh script to clean up the CAS PVCs. This step deletes the existing data on the CAS permstore (cas-default-permstore) and CAS data (cas-default-data) PVCs. The script takes three parameters: the namespace, the operation to perform (here remove), and the list of CAS server instances (here "default").
cd ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/
./sas-backup-pv-copy-cleanup.sh gelcorp remove "default"
Output:
The cleanup pods are created, and they are in a running state.
Ensure that all pods are completed. To check the status of the cleanup pods, run the following command.
kubectl -n gelcorp get pods -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup | grep 21bef2c
The script creates a Kubernetes job that clears the key data from the CAS PVCs so that the data can be restored from the backup package. For more detail, we can view the job log.
kubectl -n gelcorp logs -l sas.com/backup-job-type=sas-backup-pv-copy-cleanup
With the CAS PVCs successfully cleaned, we can start up the CAS server(s) using scale-up-cas.sh. The script takes two parameters: the namespace and the list of CAS server instances to start (here "default").
cd ~/project/deploy/${current_namespace}/sas-bases/examples/restore/scripts/
./scale-up-cas.sh gelcorp "default"
casdeployment.viya.sas.com/default patched
When the CAS server starts, it checks the restore job configMap attribute SAS_DEPLOYMENT_START_MODE. If it is set to RESTORE, the CAS server starts and restores the data from the directory in the sas-cas-backup-data PVC that matches the SAS_BACKUP_ID.
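Conceptually, the startup decision looks like the sketch below. This is an illustration of the logic described above, not the actual CAS entrypoint script, and the variable values are sample data:

```shell
# Values as they would be read from the sas-restore-job-parameters configMap
# (sample data; the real values come from the patched configMap).
SAS_DEPLOYMENT_START_MODE="RESTORE"
SAS_BACKUP_ID="2023-11-09T15_32_28_628_0700"

# Illustrative decision only; the real logic lives inside CAS server startup.
if [ "$SAS_DEPLOYMENT_START_MODE" = "RESTORE" ]; then
  echo "Starting CAS in RESTORE mode; restoring backup ${SAS_BACKUP_ID} from the sas-cas-backup-data PVC"
else
  echo "Starting CAS normally"
fi
```

This is also why the final cleanup step below matters: as long as SAS_DEPLOYMENT_START_MODE remains RESTORE in the configMap, every CAS restart would take the restore branch.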
You can check the CAS Server log to see if the restore was performed. The logs will show the start of the restore process and details of the restore of the backup content from the backup package to the target CAS persistent volumes.
kubectl logs sas-cas-server-default-controller -c sas-cas-server | grep -A 10 "RESTORE"
An important final step is to reset all SAS restore job configMap parameters. If we don't perform this step the CAS server will attempt to restore the backup from the package on every restart.
kubectl patch cm $restore_config_map --type json -p '[{ "op": "remove", "path": "/data/SAS_BACKUP_ID" },{"op": "remove", "path": "/data/SAS_DEPLOYMENT_START_MODE"}]'
Restoring a SAS Viya backup is now initiated using kubectl commands and scripts. The new and improved restore process is currently supported for backup and restore and for Viya 4 to Viya 4 migration. The old method of building and applying manifests is also still supported, and there are plans to add support for the new method for Viya 3.x to Viya 4 migration. I hope you found this useful. Look for more exciting updates and related blog posts in the backup and restore area in the coming months.
The new restore process is documented in the official SAS Viya platform documentation.
Find more articles from SAS Global Enablement and Learning here.