A SAS Viya deployment comprises many individual processes, and making sure all of them are up and responding is one of the primary concerns for SAS administrators. There are many approaches to this problem, most of which lean heavily on the SAS Configuration Server to assess process availability, because each service is supposed to register itself in the SAS Configuration Server when it starts. Relying on the SAS Configuration Server to tell us about the services in our deployment works well as long as each service starts well enough to register itself and no service is stopped in such a way that it unregisters itself. If a service fails to register, or unregisters itself, the SAS Configuration Server has no knowledge of it, so the service is effectively ignored when process availability is reported. This is why administrators have to be especially careful when reviewing the Availability portlet on the SAS Environment Manager dashboard page.
The root problem is that the SAS Configuration Server has no knowledge of the processes that should be present but aren't. With SAS Viya 3.5, the sas-admin healthcheck plug-in provides administrators with the ability to check on processes that should be present without relying on the SAS Configuration Server. Let's take a look at this new administration feature.
The healthcheck plug-in has several options, but in this post I am going to focus on the system-health check-status options. There are two modes for assessing system health.
This is a key difference to understand. The complex mode not only reports on microservices but also includes checks for applications such as
For this post, I am going to focus on complex mode because it is the mode that lets administrators control which services are examined by feeding it a list of services to check.
Let's see how the healthcheck plug-in can help an administrator assess system process availability. After authenticating myself to the sas-admin framework, I am going to issue a sas-admin healthcheck command in complex mode to look at the processes for a healthy, fully functional Viya 3.5 deployment. The results indicate that 95 processes were tested in my deployment and that one of those 95 reports as down.
./sas-admin healthcheck system-health check-status complex
Searching for services, applications, and infrastructure applications.
[Validating Health] .............................................................................................................
Services Endpoint Status HTTP Status Time of Call Duration
Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:36:59.075Z 2358
"1" of "95" health validations failed.
The following errors were generated during the execution of this program:
The following error was encountered when making an endpoint call to "/discoveryTableProvider": 1 error occurred:
* The server failed to fulfill an apparently valid request.
The one service that reports a 'down' status, discoveryTableProvider, does so because of a licensing issue: the service cannot respond, and the lack of a valid response is interpreted as a 'down' state. Therefore, I can safely ignore that service when assessing my system health.
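If you script around this command, it can help to filter out a failure you already expect. The following is only a minimal Python sketch, not part of sas-admin. It assumes the text output keeps the row layout shown above (name, endpoint, status, HTTP status, timestamp, duration). The function name `unexpected_failures` and its ignore list are hypothetical.

```python
import re

# Matches a service or application row such as:
#   Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:36:59.075Z 2358
# The layout is an assumption taken from the sample output in this post.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<endpoint>/\S+)\s+(?P<status>up|down)\s+"
    r"(?P<http>\d{3})\s+(?P<time>\S+)\s+(?P<duration>\d+)\s*$"
)


def unexpected_failures(output, ignore=("/discoveryTableProvider",)):
    """Return endpoints reported as down, excluding ones we expect to fail."""
    failed = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and match["status"] == "down" and match["endpoint"] not in ignore:
            failed.append(match["endpoint"])
    return failed


sample = """\
Services Endpoint Status HTTP Status Time of Call Duration
Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:36:59.075Z 2358
"1" of "95" health validations failed.
"""
print(unexpected_failures(sample))  # prints []
```

With the default ignore list, the known licensing failure is dropped, so an empty list means nothing unexpected is down among the rows that were tested.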
So far so good. Let's add the --display-results option to display details for each of the services that were tested. As you can see from the following output, each service has an indication of the endpoint that is used to test process availability and the results show the HTTP status of each endpoint test. If you take a second to scroll through the information you will see that there are three sections of processes: microservices are listed first, followed by SAS web applications, which are followed by infrastructure processes such as CAS, the SAS Infrastructure Data Server, and the SAS Message Broker.
./sas-admin healthcheck system-health check-status complex --display-results
Searching for services, applications, and infrastructure applications.
[Validating Health] .............................................................................................................
Services Endpoint Status HTTP Status Time of Call Duration
annotations /annotations up 200 2020-02-28T13:21:26.272Z 69
Application Registry service /appRegistry up 200 2020-02-28T13:21:26.341Z 96
Audit service /audit up 200 2020-02-28T13:21:26.438Z 70
Authorization service /authorization up 200 2020-02-28T13:21:26.508Z 12
backup-agent /backup-agent up 404 2020-02-28T13:21:26.521Z 25
Cache Locator service /cachelocator up 200 2020-02-28T13:21:26.546Z 21
Cache Server service /cacheserver up 200 2020-02-28T13:21:26.568Z 21
casAccessManagement /casAccessManagement up 200 2020-02-28T13:21:27.474Z 70
casFormats /casFormats up 200 2020-02-28T13:21:27.544Z 74
CAS Management service /casManagement up 200 2020-02-28T13:21:27.618Z 22
CAS Proxy service /casProxy up 200 2020-02-28T13:21:27.641Z 21
CAS Row Sets service /casRowSets up 200 2020-02-28T13:21:27.662Z 73
Catalog service /catalog up 200 2020-02-28T13:21:27.735Z 72
codeDebugger /codeDebugger up 200 2020-02-28T13:21:27.807Z 16
Comments service /comments up 200 2020-02-28T13:21:27.824Z 84
compute /compute up 200 2020-02-28T13:21:27.909Z 52
Configuration service /configuration up 200 2020-02-28T13:21:27.961Z 72
credentials /credentials up 200 2020-02-28T13:21:28.034Z 70
Cross-domain Proxy service /crossdomainproxy up 400 2020-02-28T13:21:28.105Z 19
Data Discovery service /dataDiscovery up 200 2020-02-28T13:21:28.125Z 35
Data Plans service /dataPlans up 200 2020-02-28T13:21:28.161Z 793
Profile Results service /dataProfiles up 200 2020-02-28T13:21:28.954Z 921
Data Sources service /dataSources up 200 2020-02-28T13:21:29.876Z 88
Data Tables service /dataTables up 200 2020-02-28T13:21:29.964Z 602
Backup service /deploymentBackup up 200 2020-02-28T13:21:30.567Z 423
Device Management service /deviceManagement up 200 2020-02-28T13:21:30.990Z 94
Discovery Table Provider /discoveryTableProvider down 503 2020-02-28T13:21:31.085Z 1450
Files service /files up 200 2020-02-28T13:21:32.535Z 31
Folders service /folders up 200 2020-02-28T13:21:32.567Z 129
Fonts service /fonts up 200 2020-02-28T13:21:32.696Z 88
Geo Enrichment service /geoEnrichment up 200 2020-02-28T13:21:32.785Z 73
Identities service /identities up 200 2020-02-28T13:21:32.859Z 58
import9 /import9 up 200 2020-02-28T13:21:32.917Z 50
Job Flow Scheduling service /jobFlowScheduling up 200 2020-02-28T13:21:32.968Z 22
Launcher service /launcher up 200 2020-02-28T13:21:32.990Z 54
licenses /licenses up 200 2020-02-28T13:21:33.044Z 23
Links service /links up 200 2020-02-28T13:21:33.068Z 72
Mail service /mail up 200 2020-02-28T13:21:33.141Z 77
Maps service /maps up 200 2020-02-28T13:21:33.219Z 74
Micro Analytic Score service /microanalyticScore up 200 2020-02-28T13:21:33.293Z 62
Model Management service /modelManagement up 200 2020-02-28T13:21:33.356Z 80
Model Publish service /modelPublish up 200 2020-02-28T13:21:33.436Z 66
Model Repository service /modelRepository up 200 2020-02-28T13:21:33.502Z 51
monitoring /monitoring up 200 2020-02-28T13:21:33.554Z 16
Natural Language Generation service /naturalLanguageGeneration up 200 2020-02-28T13:21:33.570Z 75
Natural Language Understanding service /naturalLanguageUnderstanding up 200 2020-02-28T13:21:33.646Z 90
Notifications service /notifications up 200 2020-02-28T13:21:33.736Z 63
Preferences service /preferences up 200 2020-02-28T13:21:33.918Z 56
Projects service /projects up 200 2020-02-28T13:21:33.975Z 83
Relationships service /relationships up 200 2020-02-28T13:21:34.212Z 68
Report Alerts service /reportAlerts up 200 2020-02-28T13:21:34.280Z 44
Report Data service /reportData up 200 2020-02-28T13:21:34.325Z 76
Report Distribution service /reportDistribution up 200 2020-02-28T13:21:34.401Z 16
Report Images service /reportImages up 200 2020-02-28T13:21:34.417Z 228
Report Packages service /reportPackages up 200 2020-02-28T13:21:34.646Z 123
Report Renderer service /reportRenderer up 200 2020-02-28T13:21:34.770Z 136
Report Templates service /reportTemplates up 200 2020-02-28T13:21:34.907Z 52
Report Transforms service /reportTransforms up 200 2020-02-28T13:21:34.959Z 44
reportViewerNaturalLanguageUnderstanding /reportViewerNaturalLanguageUnderstanding up 200 2020-02-28T13:21:35.004Z 138
Reports Persistence service /reports up 200 2020-02-28T13:21:35.143Z 467
Row Sets service /rowSets up 200 2020-02-28T13:21:35.610Z 351
Schedule service /scheduler up 200 2020-02-28T13:21:35.961Z 63
Score Definition service /scoreDefinitions up 200 2020-02-28T13:21:36.025Z 885
Score Execution service /scoreExecution up 200 2020-02-28T13:21:36.910Z 87
Search service /search up 200 2020-02-28T13:21:36.998Z 57
Search Index service /searchIndex up 200 2020-02-28T13:21:37.056Z 547
templates /templates up 200 2020-02-28T13:21:37.603Z 130
Tenant service /tenant up 200 2020-02-28T13:21:37.734Z 202
themeContent /themeContent up 200 2020-02-28T13:21:37.936Z 143
Themes service /themes up 200 2020-02-28T13:21:38.079Z 48
Thumbnails service /thumbnails up 200 2020-02-28T13:21:38.127Z 153
Transfer service /transfer up 200 2020-02-28T13:21:38.281Z 110
Transformations service /transformations up 200 2020-02-28T13:21:38.391Z 54
types /types up 200 2020-02-28T13:21:38.446Z 59
Web Data Access service /webDataAccess up 200 2020-02-28T13:21:38.506Z 85
Workflow service /workflow up 200 2020-02-28T13:21:38.591Z 62
Workflow Definition service /workflowDefinition up 200 2020-02-28T13:21:38.653Z 68
Workflow History service /workflowHistory up 200 2020-02-28T13:21:38.721Z 48
Applications Endpoint Status HTTP Status Time of Call Duration
SASBackupManager /SASBackupManager up 200 2020-02-28T13:21:20.608Z 192
SASCodeDebugger /SASCodeDebugger up 200 2020-02-28T13:21:20.800Z 1115
SAS Data Explorer /SASDataExplorer up 200 2020-02-28T13:21:21.916Z 937
SAS Data Studio /SASDataStudio up 200 2020-02-28T13:21:22.853Z 72
SAS Drive /SASDrive up 200 2020-02-28T13:21:22.925Z 1101
SAS Environment Manager /SASEnvironmentManager up 200 2020-02-28T13:21:24.027Z 173
SAS Graph Builder /SASGraphBuilder up 200 2020-02-28T13:21:24.200Z 152
SAS Job Execution /SASJobExecution up 200 2020-02-28T13:21:24.353Z 880
SAS Lineage /SASLineage up 200 2020-02-28T13:21:25.233Z 200
SAS Logon Manager /SASLogon up 200 2020-02-28T13:21:25.434Z 51
SAS Model Manager /SASModelManager up 200 2020-02-28T13:21:25.486Z 184
SASStudio /SASStudio up 200 2020-02-28T13:21:25.671Z 26
SAS Studio Viya /SASStudioV up 200 2020-02-28T13:21:25.697Z 72
SAS Theme Designer /SASThemeDesigner up 200 2020-02-28T13:21:25.769Z 183
SAS Visual Analytics /SASVisualAnalytics up 200 2020-02-28T13:21:25.952Z 156
SAS Workflow Manager /SASWorkflowManager up 200 2020-02-28T13:21:26.109Z 162
Infrastructure Applications Status Time of Call Duration
cas-shared-default up 2020-02-28T13:21:26.589Z 760
cas-shared-default-http up 2020-02-28T13:21:27.350Z 123
SAS Infrastructure Data Server up 2020-02-28T13:21:33.799Z 118
SAS Message Broker up 2020-02-28T13:21:34.058Z 117
"1" of "95" health validations failed.
The items with statuses in yellow are functioning properly, but cannot be directly verified due to their nature.
The following errors were generated during the execution of this program:
The following error was encountered when making an endpoint call to "/discoveryTableProvider": 1 error occurred:
* The server failed to fulfill an apparently valid request.
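The detailed output contains a couple of rows that are reported as up even though the HTTP status is not 200 (backup-agent with 404 and the Cross-domain Proxy service with 400). These may be the items the tool shows in yellow, although that is my reading rather than something the output states. A small Python sketch for picking such rows out of captured output follows; the row layout is assumed from the sample above and the function name `up_but_not_200` is hypothetical.

```python
import re

# Row layout assumed from the --display-results sample in this post.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<endpoint>/\S+)\s+(?P<status>up|down)\s+"
    r"(?P<http>\d{3})\s+\S+\s+\d+\s*$"
)


def up_but_not_200(output):
    """Return (name, HTTP status) for rows marked up with a non-200 status."""
    found = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and match["status"] == "up" and match["http"] != "200":
            found.append((match["name"], int(match["http"])))
    return found


sample = """\
Authorization service /authorization up 200 2020-02-28T13:21:26.508Z 12
backup-agent /backup-agent up 404 2020-02-28T13:21:26.521Z 25
Cross-domain Proxy service /crossdomainproxy up 400 2020-02-28T13:21:28.105Z 19
cas-shared-default up 2020-02-28T13:21:26.589Z 760
"""
print(up_but_not_200(sample))
# prints [('backup-agent', 404), ('Cross-domain Proxy service', 400)]
```

Infrastructure rows have no endpoint or HTTP status column, so this sketch simply skips them.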
So while we have a healthy system with all of the processes that need to be up, let's re-run the healthcheck command with the --create-yaml option, which creates an output file named complexCheck.yml containing details for all of the tested processes.
./sas-admin healthcheck system-health check-status complex --create-yaml complexCheck.yml
Searching for services, applications, and infrastructure applications.
[Validating Health] ..............................................................................................................
Services Endpoint Status HTTP Status Time of Call Duration
Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:49:09.689Z 1804
"1" of "95" health validations failed.
Here is a look at the complexCheck.yml file. As the comments explain, for each service or application listed, the program asks the SAS Configuration Server (Consul) for the head endpoint to call to assess health. The interesting thing, though, is that we now have a list of every service that should be present in a healthy deployment. We can modify this YAML file to suit our purposes, such as splitting out separate checks for the web applications or the infrastructure components. We also have the option to add service checks, provided we know the endpoint to call, or to specify an exact endpoint to use for any of the existing services.
####################################################################################################################
# A YAML file can be specified for the "complex" command.
# At the minimum, a list of service or application names must be specified.
# For every service or application listed, the program goes to Consul to find the correct head endpoint to use.
# You can specify additional endpoints to test in the YAML file. If Consul finds a match for that service or application,
# calls are made to the head endpoint in addition to making calls to all the endpoints specified in the YAML file.
# If you specify a service or application name in the file that Consul does not recognize, the program tests only
# the endpoints specified in the YAML file. If there are no endpoints, then that item is skipped.
# Pasted below is a sample format for a complex check configuration file:
#- Name: Files
#- Name: Advanced Analytics Components service
#  Endpoints:
#    - /analyticsComponents/commons/health
#    - /analyticsComponents/components
#    - /analyticsComponents/templates
#    - /files/files
#- Name: Advanced Analytics Data Segmentation service
#  Endpoints:
#    - /analyticsDataSegmentation/commons/health
#    - /analyticsDataSegmentation/plans
#- Name: Analytics Events service
####################################################################################################################
# Applications:
- Name: SASBackupManager
  Endpoints:
    - /SASBackupManager
- Name: SASCodeDebugger
  Endpoints:
    - /SASCodeDebugger
- Name: SAS Data Explorer
  Endpoints:
    - /SASDataExplorer
- Name: SAS Data Studio
  Endpoints:
    - /SASDataStudio
- Name: SAS Drive
  Endpoints:
    - /SASDrive
- Name: SAS Environment Manager
  Endpoints:
    - /SASEnvironmentManager
- Name: SAS Graph Builder
  Endpoints:
    - /SASGraphBuilder
- Name: SAS Job Execution
  Endpoints:
    - /SASJobExecution
- Name: SAS Lineage
  Endpoints:
    - /SASLineage
- Name: SAS Logon Manager
  Endpoints:
    - /SASLogon
- Name: SAS Model Manager
  Endpoints:
    - /SASModelManager
- Name: SASStudio
  Endpoints:
    - /SASStudio
- Name: SAS Studio Viya
  Endpoints:
    - /SASStudioV
- Name: SAS Theme Designer
  Endpoints:
    - /SASThemeDesigner
- Name: SAS Visual Analytics
  Endpoints:
    - /SASVisualAnalytics
- Name: SAS Workflow Manager
  Endpoints:
    - /SASWorkflowManager
# Services:
- Name: annotations
  Endpoints:
    - /annotations
- Name: Application Registry service
  Endpoints:
    - /appRegistry
- Name: Audit service
  Endpoints:
    - /audit
- Name: Authorization service
  Endpoints:
    - /authorization
- Name: backup-agent
  Endpoints:
    - /backup-agent
- Name: Cache Locator service
  Endpoints:
    - /cachelocator
- Name: Cache Server service
  Endpoints:
    - /cacheserver
- Name: casAccessManagement
  Endpoints:
    - /casAccessManagement
- Name: casFormats
  Endpoints:
    - /casFormats
- Name: CAS Management service
  Endpoints:
    - /casManagement
- Name: CAS Proxy service
  Endpoints:
    - /casProxy
- Name: CAS Row Sets service
  Endpoints:
    - /casRowSets
- Name: Catalog service
  Endpoints:
    - /catalog
- Name: codeDebugger
  Endpoints:
    - /codeDebugger
- Name: Comments service
  Endpoints:
    - /comments
- Name: compute
  Endpoints:
    - /compute
- Name: Configuration service
  Endpoints:
    - /configuration
- Name: credentials
  Endpoints:
    - /credentials
- Name: Cross-domain Proxy service
  Endpoints:
    - /crossdomainproxy
- Name: Data Discovery service
  Endpoints:
    - /dataDiscovery
- Name: Data Plans service
  Endpoints:
    - /dataPlans
- Name: Profile Results service
  Endpoints:
    - /dataProfiles
- Name: Data Sources service
  Endpoints:
    - /dataSources
- Name: Data Tables service
  Endpoints:
    - /dataTables
- Name: Backup service
  Endpoints:
    - /deploymentBackup
- Name: Device Management service
  Endpoints:
    - /deviceManagement
- Name: Discovery Table Provider
  Endpoints:
    - /discoveryTableProvider
- Name: Files service
  Endpoints:
    - /files
- Name: Folders service
  Endpoints:
    - /folders
- Name: Fonts service
  Endpoints:
    - /fonts
- Name: Geo Enrichment service
  Endpoints:
    - /geoEnrichment
- Name: Graph Template Service
  Endpoints:
    - /graphTemplates
- Name: Identities service
  Endpoints:
    - /identities
- Name: import9
  Endpoints:
    - /import9
- Name: Job Flow Scheduling service
  Endpoints:
    - /jobFlowScheduling
- Name: Launcher service
  Endpoints:
    - /launcher
- Name: licenses
  Endpoints:
    - /licenses
- Name: Links service
  Endpoints:
    - /links
- Name: Mail service
  Endpoints:
    - /mail
- Name: Maps service
  Endpoints:
    - /maps
- Name: Micro Analytic Score service
  Endpoints:
    - /microanalyticScore
- Name: Model Management service
  Endpoints:
    - /modelManagement
- Name: Model Publish service
  Endpoints:
    - /modelPublish
- Name: Model Repository service
  Endpoints:
    - /modelRepository
- Name: monitoring
  Endpoints:
    - /monitoring
- Name: Natural Language Generation service
  Endpoints:
    - /naturalLanguageGeneration
- Name: Natural Language Understanding service
  Endpoints:
    - /naturalLanguageUnderstanding
- Name: Notifications service
  Endpoints:
    - /notifications
- Name: Preferences service
  Endpoints:
    - /preferences
- Name: Projects service
  Endpoints:
    - /projects
- Name: Relationships service
  Endpoints:
    - /relationships
- Name: Report Alerts service
  Endpoints:
    - /reportAlerts
- Name: Report Data service
  Endpoints:
    - /reportData
- Name: Report Distribution service
  Endpoints:
    - /reportDistribution
- Name: Report Images service
  Endpoints:
    - /reportImages
- Name: Report Packages service
  Endpoints:
    - /reportPackages
- Name: Report Renderer service
  Endpoints:
    - /reportRenderer
- Name: Report Templates service
  Endpoints:
    - /reportTemplates
- Name: Report Transforms service
  Endpoints:
    - /reportTransforms
- Name: reportViewerNaturalLanguageUnderstanding
  Endpoints:
    - /reportViewerNaturalLanguageUnderstanding
- Name: Reports Persistence service
  Endpoints:
    - /reports
- Name: Row Sets service
  Endpoints:
    - /rowSets
- Name: Schedule service
  Endpoints:
    - /scheduler
- Name: Score Definition service
  Endpoints:
    - /scoreDefinitions
- Name: Score Execution service
  Endpoints:
    - /scoreExecution
- Name: Search service
  Endpoints:
    - /search
- Name: Search Index service
  Endpoints:
    - /searchIndex
- Name: templates
  Endpoints:
    - /templates
- Name: Tenant service
  Endpoints:
    - /tenant
- Name: themeContent
  Endpoints:
    - /themeContent
- Name: Themes service
  Endpoints:
    - /themes
- Name: Thumbnails service
  Endpoints:
    - /thumbnails
- Name: Transfer service
  Endpoints:
    - /transfer
- Name: Transformations service
  Endpoints:
    - /transformations
- Name: types
  Endpoints:
    - /types
- Name: Web Data Access service
  Endpoints:
    - /webDataAccess
- Name: Workflow service
  Endpoints:
    - /workflow
- Name: Workflow Definition service
  Endpoints:
    - /workflowDefinition
- Name: Workflow History service
  Endpoints:
    - /workflowHistory
# Infrastructure Applications:
- Name: cas-shared-default
  Endpoints:
    - /cas-shared-default
- Name: cas-shared-default-http
  Endpoints:
    - /cas-shared-default-http
- Name: SAS Infrastructure Data Server
  Endpoints:
    - /postgres
- Name: SAS Message Broker
  Endpoints:
    - /rabbitmq
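Splitting the generated file into separate checks for web applications, services, and infrastructure can be done by hand, but here is a minimal Python sketch of one way to do it. It assumes the file contains the three section comments shown above ("# Applications:", "# Services:", "# Infrastructure Applications:"). The function name `split_sections` and the short section keys are hypothetical, and the sample text is a trimmed stand-in for a real file.

```python
# Section comments as they appear in the generated file shown in this post,
# mapped to short keys chosen here for illustration only.
SECTION_MARKERS = {
    "# Applications:": "applications",
    "# Services:": "services",
    "# Infrastructure Applications:": "infrastructure",
}


def split_sections(yaml_text):
    """Group the entry lines of a generated file by its section comments."""
    sections = {}
    current = None
    for line in yaml_text.splitlines():
        key = SECTION_MARKERS.get(line.strip())
        if key is not None:
            # A new section starts; later lines belong to it.
            current = key
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections


sample = """\
# Applications:
- Name: SAS Drive
  Endpoints:
    - /SASDrive
# Services:
- Name: Files service
  Endpoints:
    - /files
# Infrastructure Applications:
- Name: SAS Message Broker
  Endpoints:
    - /rabbitmq
"""
parts = split_sections(sample)
print(sorted(parts))  # prints ['applications', 'infrastructure', 'services']
print("\n".join(parts["infrastructure"]))
```

Each list of lines can then be written to its own file and used as a separate list of services to check. Anything before the first section comment, such as the header comments, is ignored.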
Let's see how having this list of services can be especially useful to administrators.
Suppose one of our services is either accidentally stopped or perhaps did not start sufficiently to register itself in the SAS Configuration Server. I'm going to simulate this by stopping the graph templates service.
Let's see how this is reported by the healthcheck using the default method of obtaining the list of available services from the SAS Configuration Server.
./sas-admin healthcheck system-health check-status complex
Searching for services, applications, and infrastructure applications.
[Validating Health] .............................................................................................................
Services Endpoint Status HTTP Status Time of Call Duration
Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:36:59.075Z 2358
"1" of "94" health validations failed.
The following errors were generated during the execution of this program:
The following error was encountered when making an endpoint call to "/discoveryTableProvider": 1 error occurred:
* The server failed to fulfill an apparently valid request.
Interestingly, the results still indicate that only the expected discoveryTableProvider service is down. If you are a very observant administrator, you might notice that the total number of health validations is now 94 instead of the 95 we had earlier with a healthy system. If you were in a hurry and not paying close attention, you might miss that small difference and assume that everything is OK when in fact there is a missing service.
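A drop from 95 to 94 is easy to overlook by eye, so a script could compare the total in the summary line against a known baseline. This is a small Python sketch that assumes the summary line has exactly the wording shown in the output above; the function name `check_total` and the baseline value passed to it are hypothetical.

```python
import re

# Summary line wording taken from the sample output in this post.
SUMMARY = re.compile(r'"(\d+)" of "(\d+)" health validations failed\.')


def check_total(output, expected_total):
    """Return (failed, total, ok), where ok means total matches the baseline."""
    match = SUMMARY.search(output)
    if match is None:
        raise ValueError("summary line not found")
    failed, total = int(match.group(1)), int(match.group(2))
    return failed, total, total == expected_total


print(check_total('"1" of "94" health validations failed.', expected_total=95))
# prints (1, 94, False)
```

The third value flags the mismatch even though the number of reported failures has not changed.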
Let's re-run the healthcheck but this time pass it our yaml file as the list of services we want checked by adding the --source-location option to the command.
./sas-admin healthcheck system-health check-status complex --source-location complexCheck.yml
[Validating Health] ...................................................................................................
Services Endpoint Status HTTP Status Time of Call Duration
Discovery Table Provider /discoveryTableProvider down 503 2020-02-25T11:57:23.201Z 1499
Infrastructure Applications Status Time of Call Duration
Graph Template Service down 2020-02-25T11:57:25.047Z 1
"2" of "95" health validations failed.
The following errors were generated during the execution of this program:
The following error was encountered when making an endpoint call to "/discoveryTableProvider": 1 error occurred:
* The server failed to fulfill an apparently valid request.
The following error was encountered when making an endpoint call to "/graphTemplates": 1 error occurred:
* The requested resource could not be found.
Aha! Now we can clearly see that there are two services down and the total number of health validations is back up to 95. So forcing healthcheck to look for all of the services we expect to have in our deployment appears to provide a bit of protection for administrators in cases where services do not get registered into the SAS Configuration Server.
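The same idea can be used to name the missing service directly, by comparing the names in a baseline file from a healthy system with the names in a freshly generated one. The Python sketch below assumes that a newly generated file lists only what is currently registered, which I have not verified. The function names are hypothetical, and the two sample strings are trimmed stand-ins for real files.

```python
def service_names(yaml_text):
    """Collect the values of '- Name:' lines, skipping commented-out samples."""
    names = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- Name:"):
            names.append(stripped[len("- Name:"):].strip())
    return names


def missing_services(baseline_text, current_text):
    """Return names present in the baseline but absent from the current file."""
    current = set(service_names(current_text))
    return [name for name in service_names(baseline_text) if name not in current]


baseline = """\
#- Name: Files
- Name: Files service
  Endpoints:
    - /files
- Name: Graph Template Service
  Endpoints:
    - /graphTemplates
"""
current = """\
- Name: Files service
  Endpoints:
    - /files
"""
print(missing_services(baseline, current))  # prints ['Graph Template Service']
```

Lines that start with "#" are ignored, so the commented sample entries at the top of a generated file do not show up as services.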
So as an administrator, I can now proactively monitor processes that I know should be in my deployment without having to rely completely on the SAS Configuration Server. This should enable me to more reliably detect process issues and help maintain more robust system health.
There is much more to the healthcheck plug-in than I have touched on in this post, so please take a look at the documentation for a more comprehensive understanding of the many options.
The sas-admin command line interface is one of the best, if not the best, administration tools for Viya. If you are not yet familiar with sas-admin, I recommend that you read Gerry Nelson's posts "SAS Viya command-line interfaces for Administration" and "Keeping the SAS Administration Command-Line interfaces up-to-date" to learn more.
Hi Scott
Useful blog as always.
I presume the simple way to get around the warning message for the /discoveryTableProvider endpoint is to comment it out of the generated yaml file?
Will this be fixed in a future release, as it's a useful addition to the admin toolkit?
regards
Alan
Thanks Scott
Actually, after upgrading my Viya install from v3.4 to v3.5 & updating the yaml file used by the healthcheck process, this was the only endpoint which returned an error - exactly as per your example (all was well in Environment Manager though)
Something for Tech Support, I'd say?
Alan
Hi Scott
I tried this for just this endpoint with verbose logging & as you say, the error relates to a license:
{
"errorCode":0,
"message":"The product license was not found.",
"details":["traceId: a4dae0be1d56cf7e","path: /discoveryTableProvider/"],
"links":[],
"version":2,
"httpStatusCode":503
}