SAS Viya on Kubernetes is a complicated system composed of over 140 pods and over 160 distinct services, and that is for a pretty simple deployment. When an administrator starts a SAS Viya deployment, all of these pods race to initialize and establish communication with the other components they depend on. So how is an administrator supposed to determine when SAS Viya is up and ready for users? This very issue was raised by one of the SAS Viya early preview customers, so before SAS Viya was publicly released, SAS added the sas-readiness service to help assess whether the deployment is ready for work. The primary purpose of this service is to serve as a single contact point for administrators to determine when all of the SAS Viya services are ready to accept traffic.
One of the tough questions that had to be answered was, what does ready mean? Ask ten people and you will probably get eleven opinions. Does it mean there is a minimum number of services up and running for a user to log on? Does it mean a user can log on and run a batch job? Run a Visual Analytics report? Load data into CAS? The problem with approaching the question from a functional standpoint is that no two SAS Viya deployments are used in identical ways, so what ready means for one may not mean the same for another.
For now, the sas-readiness service works under the assumption that if all services in the deployment are accepting traffic, we can presume that Viya is functionally ready and the system should be responsive to user input. Yes, it is a very granular approach to readiness, but using this standard has some advantages, which we will look at shortly.
Every Viya service exposes an /internal/ready endpoint that returns an HTTP response code in the 200 range if the service is ready to receive traffic; any other value is interpreted as 'not ready.' The sas-readiness service probes the /internal/ready endpoint using HTTP GET and checks the return code. Nice and tidy...and fast. Because the probe is so lightweight, sas-readiness is able to re-probe each service every 30 seconds without overburdening system resources.
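To make the rule concrete, here is a minimal Python sketch of that kind of probe. It is not the sas-readiness implementation; the function names `is_ready_status` and `probe` and the 5-second timeout are my own assumptions, and the only parts taken from the description above are the HTTP GET and the "200 range means ready" rule.

```python
import urllib.error
import urllib.request


def is_ready_status(code: int) -> bool:
    """A response code in the 200 range means ready; anything else does not."""
    return 200 <= code < 300


def probe(url: str, timeout: float = 5.0) -> bool:
    """Send an HTTP GET to a readiness URL and report whether it looks ready.

    Illustrative only: urllib follows redirects on its own, so a redirect
    is judged by the status of the final response.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_ready_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still available.
        return is_ready_status(err.code)
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as not ready.
        return False
```

A caller would pass the full URL of a service's /internal/ready endpoint. The host and port depend on how the service is reachable in your cluster, which is not covered here.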
If sas-readiness detects any failed requests, it emits a single log message that reports on all services that responded with a failure code. For example, this is a message I captured during startup of one of my test deployments.
{
  "level": "info",
  "version": 1,
  "source": "sas-readiness",
  "messageKey": "readiness-log-icu.check.failed.log",
  "messageParameters": {
    "check": "sas-endpoints-ready",
    "message": "17 endpoints have no available addresses: sas-audit,sas-connect-spawner,sas-data-flows,sas-decision-manager-app,sas-device-management,sas-drive-app,sas-graph-builder-app,sas-job-execution-app,sas-lineage-app,sas-model-manager-app,sas-model-studio-app,sas-report-renderer,sas-score-definitions,sas-score-execution,sas-theme-designer-app,sas-visual-analytics-app,sas-workflow-manager-app"
  },
  "properties": {
    "caller": "checks/aggregate_ready.go:69"
  },
  "attributes": {
    "failedCheck": {
      "version": 0,
      "status": 1,
      "message": "17 endpoints have no available addresses: sas-audit,sas-connect-spawner,sas-data-flows,sas-decision-manager-app,sas-device-management,sas-drive-app,sas-graph-builder-app,sas-job-execution-app,sas-lineage-app,sas-model-manager-app,sas-model-studio-app,sas-report-renderer,sas-score-definitions,sas-score-execution,sas-theme-designer-app,sas-visual-analytics-app,sas-workflow-manager-app",
      "timeStamp": "2021-02-01T17:40:49.715387407Z",
      "name": "sas-endpoints-ready",
      "attributes": {
        "notReadyEndpoints": [
          "sas-audit",
          "sas-connect-spawner",
          "sas-data-flows",
          "sas-decision-manager-app",
          "sas-device-management",
          "sas-drive-app",
          "sas-graph-builder-app",
          "sas-job-execution-app",
          "sas-lineage-app",
          "sas-model-manager-app",
          "sas-model-studio-app",
          "sas-report-renderer",
          "sas-score-definitions",
          "sas-score-execution",
          "sas-theme-designer-app",
          "sas-visual-analytics-app",
          "sas-workflow-manager-app"
        ]
      }
    }
  },
  "timeStamp": "2021-02-01T17:40:49.912356+00:00",
  "message": "The check \"sas-endpoints-ready\" failed - 17 endpoints have no available addresses: sas-audit,sas-connect-spawner,sas-data-flows,sas-decision-manager-app,sas-device-management,sas-drive-app,sas-graph-builder-app,sas-job-execution-app,sas-lineage-app,sas-model-manager-app,sas-model-studio-app,sas-report-renderer,sas-score-definitions,sas-score-execution,sas-theme-designer-app,sas-visual-analytics-app,sas-workflow-manager-app"
}
Even though 17 services were not ready, this one log message aggregates the information across all non-responsive services. Not only does this simplify interpreting results, it also helps to reduce the volume of log messages and keeps the sas-readiness response time lightning quick.
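Because the message is structured JSON, the list of lagging services is easy to pull out with a script. The following Python sketch assumes each log record arrives as one JSON document with the same field layout as the message shown above. The function name `not_ready_endpoints` is my own, and the sample record is trimmed to two services for brevity.

```python
import json


def not_ready_endpoints(log_line: str) -> list[str]:
    """Return the services listed as not ready in a sas-readiness log record.

    Records from other sources, or records without a failed check,
    yield an empty list.
    """
    record = json.loads(log_line)
    if record.get("source") != "sas-readiness":
        return []
    failed_check = record.get("attributes", {}).get("failedCheck", {})
    return failed_check.get("attributes", {}).get("notReadyEndpoints", [])


# Trimmed sample using the field names from the captured message.
sample = json.dumps({
    "level": "info",
    "source": "sas-readiness",
    "messageKey": "readiness-log-icu.check.failed.log",
    "attributes": {
        "failedCheck": {
            "name": "sas-endpoints-ready",
            "attributes": {
                "notReadyEndpoints": ["sas-audit", "sas-connect-spawner"]
            },
        }
    },
})

print(not_ready_endpoints(sample))  # ['sas-audit', 'sas-connect-spawner']
```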
When the sas-readiness probe receives success codes from all known services, it does two things: it marks itself as ready, and it emits the following log message.
{
  "level": "info",
  "version": 1,
  "source": "sas-readiness",
  "messageKey": "readiness-log-icu.checks.all.passed.log",
  "properties": {
    "caller": "checks/aggregate_ready.go:79"
  },
  "timeStamp": "2021-02-01T17:42:20.136612+00:00",
  "message": "All checks passed. Marking as ready."
}
Even though the system has been deemed ready, the sas-readiness service continues to probe every 30 seconds. However, once the 'All checks passed' message has been emitted, subsequent 'ready' results will not be noted in the log. In fact, no additional log messages will be emitted by the sas-readiness service until a failure is detected. This, again, reduces log volume and prevents redundant 'system ready' messages from appearing.
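The logging behavior described above amounts to reporting failures in one aggregated message and reporting success only on the transition to ready. Here is a small Python sketch of that pattern. It is an illustration under my own assumptions, not the actual sas-readiness code: the class name `ReadinessReporter`, the `report` method, and the wording of the failure message are all hypothetical.

```python
class ReadinessReporter:
    """Collects log messages for a series of probe cycles.

    Failures produce one aggregated message per cycle. Success produces
    a message only when the state changes from not ready to ready.
    """

    def __init__(self) -> None:
        self.ready = False
        self.messages: list[str] = []

    def report(self, results: dict[str, bool]) -> None:
        """Record one probe cycle; results maps service name to ready flag."""
        failed = sorted(name for name, ok in results.items() if not ok)
        if failed:
            # One message covers every service that failed this cycle.
            self.ready = False
            self.messages.append(f"{len(failed)} not ready: {','.join(failed)}")
        elif not self.ready:
            # First all-clear after a failure (or after startup).
            self.ready = True
            self.messages.append("All checks passed. Marking as ready.")
        # Already ready and still ready: stay quiet.


reporter = ReadinessReporter()
reporter.report({"sas-audit": False, "sas-drive-app": False})
reporter.report({"sas-audit": True, "sas-drive-app": True})
reporter.report({"sas-audit": True, "sas-drive-app": True})
print(len(reporter.messages))  # 2
```

The third cycle adds nothing to the log, which mirrors the silence you see from sas-readiness once the deployment is ready.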
As an administrator, you do not really need to go log spelunking to determine whether your Viya deployment is ready or not. Because the state of the sas-readiness pod itself reflects the readiness of the deployment, you can use the following command to have Kubernetes monitor the sas-readiness pod and let you know when the pod's condition has been set to Ready. In this example, I have asked to keep testing the condition for 30 minutes before giving up.
$ kubectl wait --for=condition=Ready pod --selector="app.kubernetes.io/name=sas-readiness" --timeout=1800s
If the timeout threshold is exceeded before the sas-readiness pod returns Ready, I will see this message, which means that I need to decide whether something is wrong or the system just did not come up within 30 minutes.
error: timed out waiting for the condition on pods/sas-readiness-7d487b9fd-v5rdt
However, the message I want to see is this one, which indicates that the sas-readiness pod is reporting Ready. And since that can only happen when the sas-readiness service successfully marked the system as ready, I can presume that my Viya deployment is ready for users.
pod/sas-readiness-7d487b9fd-v5rdt condition met
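If you want to gate a script on the same check, the kubectl wait command can be wrapped in a few lines of Python. This is a sketch under some assumptions: kubectl is on the PATH and already pointed at the right cluster, the function names are my own, and the optional namespace argument is an addition of mine (the command above relies on the namespace of the current context). kubectl wait exits with a non-zero status when the timeout expires, which is what the wrapper keys on.

```python
import subprocess
from typing import Optional


def build_wait_command(timeout_seconds: int = 1800,
                       namespace: Optional[str] = None) -> list[str]:
    """Build the kubectl wait command for the sas-readiness pod."""
    command = [
        "kubectl", "wait",
        "--for=condition=Ready", "pod",
        "--selector=app.kubernetes.io/name=sas-readiness",
        f"--timeout={timeout_seconds}s",
    ]
    if namespace:
        # Hypothetical convenience: target a namespace explicitly.
        command += ["--namespace", namespace]
    return command


def wait_for_viya(timeout_seconds: int = 1800,
                  namespace: Optional[str] = None) -> bool:
    """Return True if the pod reported Ready before the timeout expired."""
    result = subprocess.run(
        build_wait_command(timeout_seconds, namespace),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling wait_for_viya() blocks for up to 30 minutes by default, matching the timeout used in the command above.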
An administrator could also directly query the status of the sas-readiness deployment to see if readiness has been achieved. This technique is a tad more convenient for scripting because the command prints 1 once sas-readiness has marked itself as ready. Be aware that Kubernetes typically omits the readyReplicas field when no replicas are ready, so a script should treat empty output the same as 0. Here's an example command:
kubectl get deployments sas-readiness -o jsonpath='{.status.readyReplicas}'
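For scripting, the output of that command needs a little interpretation. The Python sketch below assumes kubectl is on the PATH and that the output is either a number or, when no replicas are ready and Kubernetes leaves the field out of the status, an empty string. The function names are my own.

```python
import subprocess


def parse_ready_replicas(output: str) -> int:
    """Convert the jsonpath output to a count, treating empty output as 0."""
    text = output.strip()
    return int(text) if text else 0


def viya_is_ready() -> bool:
    """Query the sas-readiness deployment and report whether it is ready."""
    result = subprocess.run(
        [
            "kubectl", "get", "deployments", "sas-readiness",
            "-o", "jsonpath={.status.readyReplicas}",
        ],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # kubectl itself failed (for example, deployment not found).
        return False
    return parse_ready_replicas(result.stdout) >= 1
```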
As an administrator attempts to assess the readiness of their Viya deployment, there are a few things that are not addressed by the sas-readiness service.
So to wrap this up, Viya administrators now have a single point of contact to assess the readiness of a Viya deployment to accept traffic. And while there are certainly other facets of the system that may affect the readiness of SAS Viya, the sas-readiness service is a tool administrators can use to gain confidence that at least the Viya services are responding and should be ready for users.