
Alert on SAS Viya license expiry with Alertmanager

Started ‎11-03-2021 by
Modified ‎11-03-2021 by

Alertmanager is the de facto alerting solution that, together with Prometheus, forms a monitoring stack widely considered to be the industry standard. As highlighted in earlier posts, Alertmanager is deployed with SAS Viya Monitoring for Kubernetes and is primarily responsible for sending alerts when SAS Viya (and other) resources in the Kubernetes cluster aren't behaving as expected. When defined thresholds (based on metrics Prometheus has collected about things like resource consumption) are met, Prometheus notifies Alertmanager, which then sends notifications to the appropriate channels. But Alertmanager isn't limited to working with Prometheus metrics. It has a REST API through which any client can submit alerts and have notifications sent. In this post, we'll demonstrate one specific example: how to trigger alerts in Alertmanager when the SAS Viya license is approaching its expiration date.

 

At a high level, this approach is made up of several components:

 

  • A sas-viya CLI command to query the license from the command line.
  • A simple script that compares the license expiry date to the current date. If the difference is thirty days or less, it sends an HTTP POST request to Alertmanager to trigger an alert.
  • A new Kubernetes CronJob that executes the script in a container on a defined schedule.
  • If necessary, updates to Alertmanager routing to ensure the alert notifications are sent via appropriate channels to the appropriate people.

 

The sas-viya CLI has a licenses plug-in which can display the expiry dates of each licensed product, along with warning and grace period dates. The following command extracts only the expiry date for Base SAS from the CLI command's output.

 

/opt/sas/viya/home/bin/sas-viya --output text licenses products list | awk '/^Base SAS/'| awk '{print $8}'

 

This command will be used in the shell script.
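
The extraction above can also be wrapped in a small function that fails loudly if the CLI output does not contain a usable date. This is only a sketch: extract_expiry is a hypothetical helper name, the field position ($8) is taken from the command above, and GNU date is assumed to be available in the container.

```shell
#!/bin/bash
# extract_expiry: hypothetical helper, not part of the sas-viya CLI.
# Reads the text output of `licenses products list` on stdin and prints
# the Base SAS expiry date as YYYY-MM-DD, or returns 1 if none is found.
extract_expiry() {
  local d
  # One awk call replaces the two chained calls; stop at the first match.
  d=$(awk '/^Base SAS/ {print $8; exit}')
  # An empty value must be rejected explicitly, because GNU date treats
  # an empty string as "today".
  [ -n "$d" ] && date -u -d "$d" +%F 2>/dev/null && return 0
  echo "unexpected expiry value: '$d'" >&2
  return 1
}

# Usage (assumes the CLI path used elsewhere in this post):
# /opt/sas/viya/home/bin/sas-viya --output text licenses products list | extract_expiry
```

Validating the value at this point means a change in the CLI output layout shows up as an error in the job log rather than as a silently wrong date comparison.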

 

Remember that the CLI first requires a user token to be obtained, which can also be done in the script, thanks to the loginviauthinfo.py program from the pyviyatools project.

 

The script also needs to extract the current date and compare with the expiry date, which can be done by converting the two dates to seconds elapsed since UNIX epoch, and finding the difference. If the difference is less than 2,592,000 seconds (30 days), the alert condition is considered to have been met. To instruct Alertmanager to trigger the alert, a cURL command is used to send a REST API call containing the necessary alert attributes to Alertmanager. Note that there's also a v2 of the Alertmanager API, which has some added features and improvements that can make the example below more elegant.

 

tee /shared/gelcontent/batch/code/license-check.sh > /dev/null << 'EOF'
#!/bin/bash

url='http://alertmanager.myserver.race.sas.com/api/v1/alerts'
thirtydays_s=2592000

# login to CLI 
/pyviyatools/loginviauthinfo.py  -f ~/.authinfo_sasboot

# store license expiry date in variable
echo "Checking license..."
licensedate=$(/opt/sas/viya/home/bin/sas-viya --output text licenses products list | awk '/^Base SAS/'| awk '{print $8}')
licensedate_s=$(date -d "$licensedate" "+%s")
echo "SAS license expiry date: " $licensedate

currdate=$(date +%F)
currdate_s=$(date -d "$currdate" "+%s")
echo "Today's date: " $currdate
diff_s=$(( licensedate_s - currdate_s ))


if [ "$diff_s" -le "$thirtydays_s" ]; then

  alertname="SAS license expiring"
  
  # send request
  curl -XPOST "$url" -H "Content-Type: application/json" -d "[{
    \"status\": \"firing\",
    \"labels\": {
      \"alertname\": \"$alertname\",
      \"severity\":\"warning\",
      \"instance\": \"Gelcorp Dev environment\"
      },
    \"annotations\": {
      \"summary\": \"Base SAS license expiring within 30 days.\",
      \"runbook\": \"http://internal.gelcorp.com/wiki/alerts/license-expiry\"
    },
    \"generatorURL\": \"https://gelcorp.myserver.race.sas.com/SASEnvironmentManager/licenses\"
  }]"

  echo ""

fi
EOF

 

Ensure the script is executable by running:

 

chmod 755 /shared/gelcontent/batch/code/license-check.sh
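
The date arithmetic in the script can be checked in isolation before it is wired into a CronJob. The following is a minimal sketch, assuming GNU date; seconds_until, expires_soon and THRESHOLD_DAYS are illustrative names that do not appear in the script above.

```shell
#!/bin/bash
# Illustrative names only; the script above uses its own variables.
THRESHOLD_DAYS=30

# Print the number of seconds from $2 (today) to $1 (expiry), both YYYY-MM-DD.
# -u pins both conversions to UTC so a daylight-saving change between the
# two dates cannot skew the result by an hour.
seconds_until() {
  local expiry_s today_s
  expiry_s=$(date -u -d "$1" +%s) || return 1
  today_s=$(date -u -d "$2" +%s) || return 1
  echo $(( expiry_s - today_s ))
}

# Succeed (exit 0) when the expiry is THRESHOLD_DAYS away or less.
expires_soon() {
  local diff_s
  diff_s=$(seconds_until "$1" "$2") || return 2
  [ "$diff_s" -le $(( THRESHOLD_DAYS * 86400 )) ]
}

# 2021-10-07 to 2021-10-28 is 21 days, i.e. 21 * 86400 seconds.
seconds_until 2021-10-28 2021-10-07   # prints 1814400
```

Since 1,814,400 is below the 2,592,000-second threshold, expires_soon 2021-10-28 2021-10-07 succeeds and the alert condition is met.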

The intention is to run this script inside a container on a daily schedule (at 2AM). To do so, a Kubernetes CronJob can be created as follows.

 

First, create the CronJob definition in a new YAML template. The definition should include a command to run the license-check.sh script created earlier. The script uses loginviauthinfo.py to authenticate the CLI, which needs access to the CLI config file (config.json) and authentication token (credentials.json). These files, along with the trusted certificates file, are made available to the container through ConfigMaps.

 

This particular environment has a dedicated namespace, "batchns", where this CronJob will run.

 

# set variables
export current_namespace=gelcorp
export SAS_CLI_PROFILE=${current_namespace}
export SSL_CERT_FILE=~/.certs/${current_namespace}_trustedcerts.pem
export REQUESTS_CA_BUNDLE=${SSL_CERT_FILE}
NODE1FQDN=$(hostname -f)
REGISTRY_NAME=myregistry.sas.com
IMAGE_TAG=${REGISTRY_NAME}/admin-toolkit/sasadmincli4:latest
echo ${NODE1FQDN}
echo ${IMAGE_TAG}

# change to deploy directory
cd ~/project/deploy/${current_namespace}/site-config/admincli

# copy necessary files
cp ~/.sas/config.json ~/project/deploy/${current_namespace}/site-config/admincli/
cp ~/.sas/credentials.json ~/project/deploy/${current_namespace}/site-config/admincli/
cp ~/.certs/${current_namespace}_trustedcerts.pem ~/project/deploy/${current_namespace}/site-config/admincli/trustedcerts.pem

# create yaml template
tee ~/project/deploy/${current_namespace}/site-config/admincli/cronlicensecheck.yaml > /dev/null << EOF
---
apiVersion: batch/v1  # use batch/v1beta1 only on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name:  cronlicensecheck
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
        template:
            spec:
              containers:
              - name: cronlicensecheck
                image: ${IMAGE_TAG}
                env:
                - name: SAS_CLI_PROFILE
                  value: "${current_namespace}"
                - name: SSL_CERT_FILE
                  value: /home/sas/.certs/trustedcerts.pem
                - name: REQUESTS_CA_BUNDLE
                  value: /home/sas/.certs/trustedcerts.pem
                command: ["/bin/sh"]
                args: ["-c", "/gelcontent/batch/code/license-check.sh"]
                volumeMounts:
                  - name: sas-viya-mycode-volume
                    mountPath: /gelcontent
                  - name: sas-cli-profile
                    mountPath: /home/sas/.sas/config.json
                    subPath: config.json
                  - name: secret-volume
                    mountPath: /home/sas/.sas/credentials.json
                    subPath: credentials.json
                  - name: cert-volume
                    mountPath: /home/sas/.certs/trustedcerts.pem
                    subPath: trustedcerts.pem
              volumes:
              - name: sas-viya-mycode-volume
                nfs:
                  server: ${NODE1FQDN}
                  path: "/shared/gelcontent"
              - name: sas-cli-profile
                configMap:
                  name: cli-config
                  items:
                  - key: config.json
                    path: config.json
              - name: secret-volume
                configMap:
                  name: cli-token
                  items:
                  - key: credentials.json
                    path: credentials.json
              - name: cert-volume
                configMap:
                  name: cert-file
                  items:
                  - key: trustedcerts.pem
                    path: trustedcerts.pem
              restartPolicy: Never
EOF

 

A kustomization.yaml file is also required to include the CronJob definition in the manifest that will be applied to the cluster.

 

tee ~/project/deploy/${current_namespace}/site-config/admincli/kustomization.yaml > /dev/null << EOF
---
namespace: batchns
generatorOptions:
  disableNameSuffixHash: true
resources:
  - cronlicensecheck.yaml
configMapGenerator:
  - name: cli-config
    files:
      - config.json

  - name: cli-token
    files:
      - credentials.json

  - name: cert-file
    files:
      - trustedcerts.pem
EOF

 

The manifest can then be generated and applied.

 

cd ~/project/deploy/${current_namespace}/site-config/admincli
kustomize build -o ~/project/deploy/${current_namespace}/site-config/admincli/cronlicensecheck_job.yaml
kubectl -n batchns apply -f ~/project/deploy/${current_namespace}/site-config/admincli/cronlicensecheck_job.yaml
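
A more cautious variant of the apply step is to preview the rendered manifest first. The sketch below assumes kustomize and kubectl are on the PATH; preview_then_apply is a hypothetical helper name.

```shell
#!/bin/bash
# preview_then_apply: hypothetical helper.
# Args: kustomization directory, target namespace.
preview_then_apply() {
  local dir="$1" ns="$2"
  # Client-side dry run: prints what would be applied without sending it
  # to the cluster. Stop here if kubectl rejects the manifest.
  kustomize build "$dir" | kubectl -n "$ns" apply --dry-run=client -f - || return 1
  # Real apply, reading the same rendered manifest from stdin.
  kustomize build "$dir" | kubectl -n "$ns" apply -f -
}

# Usage:
# preview_then_apply ~/project/deploy/${current_namespace}/site-config/admincli batchns
```

Piping the build output straight into kubectl also avoids leaving a generated manifest file inside the kustomization directory.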

The CronJob will appear in the batchns namespace:

 

kubectl get cronjob -n batchns

 

NAME               SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronlicensecheck   0 2 * * *   False     0        <none>          52m

 

The CronJob is scheduled to run once a day. Create an ad-hoc job from the CronJob to run it immediately:

 

kubectl -n batchns create job --from=cronjob/cronlicensecheck license-check-adhoc

 

The job launches a pod to run the script. A peek at the pod log provides some details about what happened when it ran.

kubectl -n batchns logs license-check-adhoc-zvdkc

 

Checking license...
SAS license expiry date:  2021-10-28
Today's date:  2021-10-07
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   370  100    20  100   350    769  13461 --:--:-- --:--:-- --:--:-- 14230
{"status":"success"}

 

The "success" message is the response to the cURL request, and it indicates that Alertmanager accepted the alert (which the script only sends when the expiry date is 30 days away or less).
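
The pod name suffix (zvdkc in the log command above) is generated, so it differs on every run. As a sketch, the create, wait and log steps can be chained by addressing the job rather than the pod. run_license_check_now is a hypothetical helper name, and the 120-second timeout is an arbitrary assumption.

```shell
#!/bin/bash
# run_license_check_now: hypothetical helper.
# Args: namespace, name for the ad-hoc job.
run_license_check_now() {
  local ns="$1" job="$2"
  # Create a one-off job from the CronJob definition.
  kubectl -n "$ns" create job --from=cronjob/cronlicensecheck "$job" &&
  # Block until the job reports the Complete condition (or time out).
  kubectl -n "$ns" wait --for=condition=complete --timeout=120s "job/$job" &&
  # Fetch logs via the job, so the generated pod name is not needed.
  kubectl -n "$ns" logs "job/$job"
}

# Usage:
# run_license_check_now batchns license-check-adhoc
```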

 

Verify by logging on to the Alertmanager UI. The alert will appear (with the labels and annotations defined earlier) in the list of firing alerts.

 

af_1_license_alert.png


 

Typically, with out-of-the-box metric alerts, clicking on the Source button opens the Prometheus UI. For this alert, the Source button instead opens SAS Environment Manager's Licenses page, as directed by the value of the generatorURL property defined in the script's cURL command. Apart from that, the firing alert can be queried, grouped or silenced just like any other firing alert.

 

Note that alerts created using the v1 scheme of the Alertmanager API don't change state until explicitly instructed to. Consequently, an alert created using the process outlined here will need to be manually resolved. That can be done with a second cURL command that sets the value of the status label on the alert to "resolved". A link to a script that runs the command (or documentation with resolution steps) could be added to the runbook attribute of the alert, making it appear as a clickable link in the firing alert (as shown above).
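
With the v2 API, an alert can carry startsAt and endsAt timestamps, and Alertmanager treats the alert as resolved once endsAt has passed. The sketch below builds such a payload; build_alert_payload is a hypothetical helper, the 25-hour lifetime is an assumption chosen to outlast one daily run, and the argument values are assumed not to contain double quotes. It is also worth checking which API versions your Alertmanager release exposes, since recent releases have dropped v1.

```shell
#!/bin/bash
# build_alert_payload: hypothetical helper that prints a v2 alert payload.
# Args: alertname, instance, startsAt, endsAt (RFC 3339 timestamps).
build_alert_payload() {
  cat <<JSON
[{
  "labels": {
    "alertname": "$1",
    "severity": "warning",
    "instance": "$2"
  },
  "annotations": {
    "summary": "Base SAS license expiring within 30 days."
  },
  "startsAt": "$3",
  "endsAt": "$4"
}]
JSON
}

# An alert that lapses 25 hours after it is raised unless the next daily
# run posts it again (GNU date assumed).
now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
ends=$(date -u -d "+25 hours" +%Y-%m-%dT%H:%M:%SZ)
build_alert_payload "SAS license expiring" "Gelcorp Dev environment" "$now" "$ends"

# To send it (host taken from the script above; the v2 path is an assumption
# about your deployment):
# build_alert_payload ... | curl -sS -XPOST -H 'Content-Type: application/json' \
#   --data-binary @- http://alertmanager.myserver.race.sas.com/api/v2/alerts
```

Posting the same labels again with an endsAt at or before the current time marks the alert as resolved, which gives a scripted alternative to resolving it by hand.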

 

The final piece of the puzzle is the routing, which determines how firing alerts are sent to the notification channels. Read my previous article to learn more about routing.

 

The example provided in this post can be used as a starting point for alerting on other aspects of a Viya platform. Consider, for example, an alert that fires when services inside Viya pods become unavailable, or when critical data becomes inaccessible. Administrators can benefit by adding these kinds of custom alerts to the collection of metric alerts that are included out-of-the-box with SAS Viya Monitoring for Kubernetes.

 

For additional information, please refer to the Alertmanager documentation page.

 

Thanks for reading.

 

 

Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
‎11-03-2021 07:48 PM