SAS Viya Monitoring for Kubernetes now offers a set of pre-configured Grafana alerts that can be deployed to improve monitoring of your SAS Viya platform. These alerts streamline alerting setup by letting administrators include SAS Viya-specific alerts from the moment of deployment, complementing the existing Kubernetes cluster-level alerts deployed with Prometheus.
Beginning in version 1.2.38 (June 2025), the monitoring deployment ships an optional* suite of SAS Viya-specific Grafana alerts. These alerts monitor critical SAS Viya components and services and flag potential issues such as resource exhaustion, pod restarts, and message queue backlogs. After deployment, administrators can configure notifiers (such as email, Slack, or SMS, as outlined in this earlier post) in Grafana’s web interface. To add custom alerts, place YAML configuration files from samples/alerts in $USER_DIR/monitoring/alerting/ before running the deployment script (deploy_monitoring_cluster.sh).
Version 1.2.42 (September 2025) also introduced an automatic SMTP server configuration feature, enabling seamless setup of email notifications by toggling the AUTOGENERATE_SMTP flag and providing SMTP server details via environment variables.
* The Grafana-managed alerts were deployed by default with SAS Viya Monitoring for Kubernetes in versions 1.2.38 through 1.2.41. From version 1.2.42 onward they are optional and can be deployed from the samples/alerts directory.
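The copy step described above (staging sample alert files in $USER_DIR/monitoring/alerting/ before running deploy_monitoring_cluster.sh) could be scripted roughly as follows. This is a minimal sketch, not part of the project: the function name `stage_sample_alerts` is hypothetical, and it assumes the samples are `.yaml` files that can be copied flat into the target directory. Check the README in samples/alerts for the layout the deployment script actually expects.

```python
import shutil
from pathlib import Path


def stage_sample_alerts(samples_dir: Path, user_dir: Path) -> list[Path]:
    """Copy sample alert YAML files into <user_dir>/monitoring/alerting.

    Assumption: every *.yaml file under samples_dir is an alert definition
    and a flat copy is acceptable (files with the same name overwrite).
    """
    target = user_dir / "monitoring" / "alerting"
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(samples_dir.rglob("*.yaml")):
        dest = target / src.name
        shutil.copy2(src, dest)  # copy file contents and timestamps
        copied.append(dest)
    return copied
```

A call such as `stage_sample_alerts(Path("samples/alerts"), Path(os.environ["USER_DIR"]))` would be run from the repository root before the deployment script. Copying only the files you have reviewed and edited is usually preferable to copying the whole directory.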
| Alert Name | Description | Condition |
| --- | --- | --- |
| Catalog DB Connections High | Triggers if active catalog database connections exceed 21. | In-use DB connections > 21 |
| Crunchy Backrest Repo Usage High | Alerts if PostgreSQL backrest repo storage usage is above 50%. | Storage usage > 50% |
| Crunchy PGData Usage High | Monitors PostgreSQL data directory usage for WAL log accumulation beyond 50%. | PGData filesystem > 50% full |
| NFS Share Usage High | Signals when NFS storage used by CAS surpasses 85%. | NFS share usage > 85% |
| CAS Restart Detected | Detects CAS pod restarts by identifying pods with uptime less than 15 minutes. | CAS pod uptime < 15 minutes |
| CAS Memory Usage High | Flags high memory consumption by CAS pods. | Custom threshold (see YAML) |
| Viya Pod Restart Count High | Alerts when pods restart more than 20 times. | Restart count > 20 |
| Viya Readiness Probe Failed | Indicates if critical SAS readiness probes fail. | Pod not Ready for critical period |
| RabbitMQ Ready Queue Backlog | Warns of excessive backlog in RabbitMQ message queues, possibly affecting downstream processing. | Ready messages > 10,000 |
| RabbitMQ Unacked Queue Backlog | Checks for unacknowledged RabbitMQ messages > 5,000, suggesting downstream service problems. | Unacked messages > 5,000 |
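To make the conditions in the table concrete, the sketch below evaluates the numeric thresholds against a snapshot of values. It is only an illustration of the comparison logic: the metric keys (for example `nfs_share_usage_pct`) are hypothetical names, not the metrics the shipped alerts query, and the two alerts without a fixed number in the table (CAS Memory Usage High and Viya Readiness Probe Failed) are left out.

```python
from datetime import datetime, timedelta, timezone

# Thresholds taken from the table; the metric keys are hypothetical names.
THRESHOLDS = {
    "Catalog DB Connections High": ("catalog_db_connections_in_use", 21),
    "Crunchy Backrest Repo Usage High": ("backrest_repo_usage_pct", 50),
    "Crunchy PGData Usage High": ("pgdata_usage_pct", 50),
    "NFS Share Usage High": ("nfs_share_usage_pct", 85),
    "Viya Pod Restart Count High": ("pod_restart_count", 20),
    "RabbitMQ Ready Queue Backlog": ("rabbitmq_ready_messages", 10_000),
    "RabbitMQ Unacked Queue Backlog": ("rabbitmq_unacked_messages", 5_000),
}

# A CAS pod younger than this is treated as having restarted.
CAS_RESTART_WINDOW = timedelta(minutes=15)


def firing_alerts(metrics, cas_started_at=None, now=None):
    """Return the names of alerts whose condition holds for this snapshot."""
    now = now or datetime.now(timezone.utc)
    # Every numeric condition in the table is a strict "greater than".
    firing = [
        name
        for name, (key, limit) in THRESHOLDS.items()
        if metrics.get(key, 0) > limit
    ]
    # CAS restart detection compares pod uptime against the 15-minute window.
    if cas_started_at is not None and now - cas_started_at < CAS_RESTART_WINDOW:
        firing.append("CAS Restart Detected")
    return firing
```

Because the comparisons are strict, a value exactly at the threshold (NFS usage of 85%, for instance) does not fire in this sketch.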
The alert manifests are provided in standard Prometheus alerting rule format (using PromQL) and can be customised before deployment to suit different SAS Viya environments. Administrators can:
The samples/alerts directory of the viya4-monitoring-kubernetes GitHub repository contains README documentation and example alert files that serve as a guide for customisation and best practices. The sample alerts can be copied to $USER_DIR/monitoring/alerting and customised to suit your particular SAS Viya environment prior to deployment.
Prometheus, when deployed with SAS Viya Monitoring for Kubernetes, already ships with a set of out-of-the-box rules defined as PrometheusRule resources, which are entirely distinct from (and complementary to) the new SAS-focused set of Grafana-managed sample alerts. These default PrometheusRules monitor common, general Kubernetes health and performance conditions such as:
These are broadly useful for any Kubernetes workload (not just SAS) and are managed separately from the SAS-specific Grafana-managed sample alerts. Both types of alerts can be viewed in Grafana, and additional custom alerts can be added to both. See my earlier post for a comparison of the alerting features and functions between Grafana and Prometheus Alertmanager.
The optional set of pre-configured SAS Viya-specific Grafana alerts provides administrators with a powerful, ready-made alerting framework, enhancing platform monitoring beyond generic Kubernetes signals. These alerts are flexible and extensible to fit diverse operational environments and work side-by-side with Prometheus's general alerts for a comprehensive monitoring solution. The added SMTP configuration and notifier flexibility further simplify integrating alerting workflows into organisational communication channels.
The new alerts empower SAS Viya platform teams to detect and respond to issues faster, as well as maintain stability and performance of SAS Viya deployments from the moment the monitoring tools are deployed.
Find more articles from SAS Global Enablement and Learning here.