Alertmanager provides the bulk of the alerting capabilities in SAS Viya, but the eagle-eyed may have noticed that Grafana too has an Alerting page. So what are the differences between alerting in Grafana and in Prometheus/Alertmanager? Are there any integration points? In this post, we'll try to answer these questions and learn about some important limitations and considerations.
So, what kind of alerting is available in Grafana? Grafana uses PromQL queries to display metrics about cluster activity from Prometheus in dashboard panels. Panels have an Alerts tab, which allows an administrator to define an alert for that panel's metric query. Alerts in Grafana are therefore tied to panels, which somewhat limits the scope of Grafana's alerting capabilities. By contrast, alerts created the 'standard' way (as PrometheusRules) can be based on any desired metrics.
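For comparison, here is a minimal sketch of what an alert defined the 'standard' way might look like. It uses the PrometheusRule resource from the Prometheus Operator and the built-in `up` metric. The resource name, the `monitoring` namespace, the `release` label, and the threshold are illustrative assumptions, not values taken from SAS Viya Monitoring; check which labels your Prometheus instance actually selects rules on.

```yaml
# Hypothetical example: names, namespace, and labels are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-target-down        # hypothetical name
  namespace: monitoring            # assumed namespace
  labels:
    release: example-release       # assumed; must match your Prometheus ruleSelector, if one is set
spec:
  groups:
    - name: example.rules
      rules:
        - alert: ExampleTargetDown
          # 'up' is 0 when Prometheus fails to scrape a target
          expr: up == 0
          # the condition must hold for 5 minutes before the alert fires
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Scrape target {{ $labels.instance }} is down"
```

Unlike a Grafana panel alert, the expression here is not tied to any dashboard, so it can reference any metric Prometheus collects.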
Another thing to consider is that Grafana dashboard panels often use template variables in the queries to extract and display metrics across nodes, pods, etc. These kinds of variables are prefixed with a dollar sign ($). For example, $node, $pod, $instance, $container, etc. can be used to display metrics across multiple instances of a component. This is great for graphing multiple metric series in a chart panel, but not so great for alerting.
In fact, alerts cannot be created for panels/charts that use template variables, which is another limitation. If both criteria are met, though (the alert is tied to a panel, and that panel's query uses no template variables), we can define an alert. The process is nice and easy, and can be done directly in the Alerts tab (yes, alerts can be created in the UI, something that isn't possible with alerts in Prometheus).
Note also that some setup may be required. For example, for email alerts, the Grafana config must be modified to define the connection information for your SMTP server. That is achieved by adding a block like the following to monitoring/user-values-prom-operator.yaml, and then running monitoring/bin/deploy_monitoring_cluster.sh to deploy (or redeploy) the monitoring applications:
    grafana:
      # See https://grafana.com/docs/grafana/latest/administration/configuration/#smtp
      "grafana.ini":
        smtp:
          enabled: "true"
          host: rext03-0171.race.sas.com:1025
          from_address: firstname.lastname@example.org
Email is certainly not the only option, though. Notifications can be sent to Google Hangouts, Slack, Discord, Teams, and many more channels. View the list of possible options in the drop-down box in the Alerting > Notification Channels page. A test notification can also be sent from this page.
Alerts appear and can be paused & resumed in the Alerting > Alert Rules page.
When the alert condition is resolved, it automatically stops firing and appears in the Alert Rules page with a status of OK. We can also view its history by clicking the State History button in the alert definition screen, a feature that isn't available in Alertmanager.
So we can create alerts in Grafana, and we can create them in Prometheus/Alertmanager. Can we integrate the two? Well, yes and no.
As outlined above, Grafana alerts and Prometheus alerts are completely independent of each other, and each application has its own mechanisms and processes for defining, triggering, acknowledging, and sending notifications for alerts. At least, that's true for the version of Grafana that comes with SAS Viya Monitoring for Kubernetes. There is a subscription-based version of Grafana (Grafana Cloud) that does support integration with Alertmanager, but it's unlikely that many (any?) SAS customers will have a license for it.
However, some integration is possible. For instance, Grafana natively supports Alertmanager as a notification channel. That is, firing Grafana alerts can be sent to Alertmanager so that they can be managed with other alerts.
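The channel can be created in the Alerting > Notification Channels page, but as a rough sketch, it could also be provisioned through the Grafana Helm chart's `notifiers` value in monitoring/user-values-prom-operator.yaml. This assumes the chart version deployed by SAS Viya Monitoring passes that key through. The channel name, uid, and Alertmanager URL below are hypothetical placeholders, so substitute the service address used in your cluster.

```yaml
# Hypothetical sketch: channel name, uid, and URL are assumptions.
grafana:
  notifiers:
    notifiers.yaml:
      notifiers:
        - name: alertmanager-channel       # hypothetical channel name
          type: prometheus-alertmanager    # Grafana's notifier type for Alertmanager
          uid: alertmanager1               # hypothetical uid
          org_id: 1
          is_default: false
          settings:
            # placeholder; replace with your Alertmanager service URL
            url: http://alertmanager.example.svc:9093
```

Once a Grafana alert rule lists this channel, its firing alerts should show up in Alertmanager alongside those raised by Prometheus, where they can be grouped and silenced in the usual way.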
There is also a way we can see what is happening in Alertmanager from within Grafana. Conveniently, SAS Viya Monitoring includes an Alerts dashboard out-of-the-box that provides a view of alerts firing in Alertmanager.
In this dashboard, we can see a count of firing and pending alerts, and a time-series chart displaying the firing alerts over time (from when they began firing). We are also provided with a count of the instances firing for each alert, and a severity label. The dashboard can be customized, or new dashboards can be added (there are others on the Grafana Labs site for interacting with Alertmanager).
The limitation, of course, is that this is a read-only view of alerts. We can't group, filter, or silence them like we can in the Alertmanager UI. We also can't create, update or delete alerts (or even view inactive/non-firing ones). But it does provide a nice view of firing alerts to complement the other bundled monitoring dashboards, and in some cases saves an administrator having to jump between different applications.
In most cases, Alertmanager is likely to be best suited for managing alerts for several reasons:
Grafana alerting, on the other hand, is simple to use and might be suited to some but not all scenarios. It certainly has some advantages over Alertmanager, but be aware of the limitations. Of course, Alertmanager also has its own limitations, so it's a matter of horses for courses. Refer to the documentation for more info.
Have you tried your hand at setting up alerts in SAS Viya using these or other tools? Please comment below to share your own experiences.
Thanks for reading!
Find more articles from SAS Global Enablement and Learning here.