Extract and visualize streaming data from an array with SAS Event Stream Processing

Welcome to the third post in this series on the Mahalanobis-Taguchi system (MTS).

The first post introduced the new procedures: MTS and MTSSCORE. The MTS procedure creates a model representing a system while it is operating in error-free conditions. The model is built using historic data at rest. Proc MTS saves the model as an astore (analytic store) binary file for ease of deployment.

Proc MTSSCORE uses the astore file to score data (events) collected from the currently running system. The MTSSCORE output 1) identifies when the system operates outside the bounds established in the error-free condition and 2) assigns a ‘gains’ value to each numeric input of the system for each event. Variables with a high ‘gains’ score contribute to the system entering a “fault” condition. This information aids root-cause analysis of a system that may be headed towards eventual failure.

Primary Objective

I wanted to see if I could use Grafana to create a meaningful visualization from streaming gains values in SAS Event Stream Processing. The following diagnostic gains chart, rendered by the MTSSCORE procedure as described in the first post, is the inspiration for my quest. In this chart, the latest event appears at the bottom, and the "important" variables with the highest gains appear at the left.

The second post in the series deploys the MTS astore model in SAS Event Stream Processing Studio using the SAS Event Stream Processing Calculate window, which now supports the MTS scoring process.

The Calculate window generates a gains value for each input variable for each event and places the values in an array. The challenge is that data in an array structure is not suitable for visualization in Grafana, the graphical product used by SAS Event Stream Processing. If you are new to working with data stored in arrays, read on to learn more.

If you are accustomed to working with arrays, you may choose to skip ahead to the Grafana Visualization section to see how closely my SAS Event Stream Processing/Grafana plot resembles the MTSSCORE plot.

Extract values from an array

Building on the SAS Event Stream Processing project from post 2, I add code to a Lua window that converts the gains array back into individual gains columns so that I can continue working on the visualization.

Why would anyone want to use Lua instead of Python?

Well, that totally depends on you. Many data handling capabilities can be done in either language.

The Lua language is described as lightweight. It is used in game development and is also well suited for working with data in applications like SAS Event Stream Processing. It is fast and provides many facilities for table manipulation, math operations, and XML and JSON string handling. Its syntax doesn’t rely on indentation, and it is easy for Python programmers to pick up. With SAS Event Stream Processing, you can use both Lua and Python windows in a project.

These are the windows used in my SAS Event Stream Processing project:

In the SAS Event Stream Processing project, the Calculate window stores the computed gainList value for each model input in array format. The LuaArray and PythonArray windows extract individual values from an array. The FilterEngine123 window returns values for a specific engine for plotting and exception monitoring.

This output from the Calculate window shows Mahalanobis distance values, outliers, and the gainList array.

Lua code used to expand the array

Before creating a graph, we need to assign each computed gains value in the gainList array to an appropriate variable name. To do this, I first specified each variable name in the same order as the source window’s schema in a table I called varnames. I could have added a prefix like “gain” to each name to differentiate the Lua window output from the Source window input variables, but since only the gainList array is passed to the Lua window, I decided to use the same name for both windows.

This is the Lua code. The comments in the code, which start with "--", describe what each part does.

In the screen capture below, the Lua window output shows the gainList array and the individual variable names with the extracted values. Notice that datetime represents the time series of events, and there is an engine identifier that is used in the Filter window to create a graph for a specific engine.

Lua window results showing individual gain columns ready for visualization.

Here is the equivalent Python code to extract values in the array to the varnames columns. I chose to add the ‘gain’ prefix to each variable name for this window’s output to explicitly differentiate the output variables from the variables in the source window.
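As a minimal sketch of that extraction in plain Python (not the exact window code), the function below copies each array entry into a prefixed column. It assumes the array arrives as an ordered list under the key gainList and that varnames lists the inputs in the same order as the source schema. The function name expand_gains and the sample event values are hypothetical, and the wiring into a SAS Event Stream Processing Python window is left out.

```python
def expand_gains(event, varnames, prefix="gain"):
    """Return a copy of `event` with one prefixed column per gainList entry.

    Assumption: event["gainList"] is ordered exactly like `varnames`.
    """
    gains = event["gainList"]
    if len(gains) != len(varnames):
        raise ValueError("gainList and varnames must have the same length")

    expanded = dict(event)  # keep datetime, engine, and the original array
    for name, value in zip(varnames, gains):
        expanded[prefix + name] = value
    return expanded


# Hypothetical event with only two inputs; the values are made up.
varnames = ["TotBypassPressure", "FuelPressureRatio"]
event = {
    "engine": 123,
    "datetime": "2024-01-01T00:00:00",
    "gainList": [5.2, 0.4],
}
expanded = expand_gains(event, varnames)
```

Checking the lengths up front guards against the array and the name list drifting out of step, which would otherwise silently assign gains to the wrong variables.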

Grafana Visualization

With the SAS Event Stream Processing project processing streaming data, let’s use Grafana to try to visualize the gains. The objective is to identify variables with high gains that cause potential error conditions.

This is the original MTS Monitoring chart followed by the MTS diagnostic gains chart produced by proc MTSSCORE from my first post in this series. The variables with the highest gain value appear to the left in the second chart, which is the one I want to recreate.

For the Grafana MTS Diagnostic Gains chart from SAS Event Stream Processing, I chose blue/yellow/red colors similar to those in the proc MTSSCORE chart above. Notice how the color gradients in both legends are similar. The Grafana chart plots the time series moving from left to right, whereas the chart above plots the time series moving from top to bottom. Variables shown in dark blue have low gains, while those in light blue, orange, and red have high gains. The results of both visualizations match!

If you would like to build this chart, select the Heatmap visualization in Grafana. For other properties, choose the RdYlBu color scheme, set the steps slider property to 30 to blend the colors in the legend, set the start color scale to -0.2 and the end color scale to 8.

Proc MTSSCORE displays the variables from left to right by decreasing gain value while Grafana orders the variables alphabetically from bottom to top. Even with this difference, it is still easy to spot the red, orange and yellow sections of the visualization to find important variables such as TotBypassPressure and FuelPressureRatio.
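If a gain-based ordering is needed, it can be approximated outside the panel. The sketch below is one possible approach and rests on several assumptions: the expanded gains have been collected as a list of per-event dictionaries, mean gain is used as the ranking criterion (this is not necessarily how proc MTSSCORE orders its chart), and the function name rank_by_mean_gain, the OtherSensor variable, and all sample values are hypothetical.

```python
from statistics import mean


def rank_by_mean_gain(events, varnames):
    """Order variable names by decreasing mean gain across the events."""
    return sorted(
        varnames,
        key=lambda name: mean(event[name] for event in events),
        reverse=True,
    )


# Made-up gains for three variables over two events.
events = [
    {"FuelPressureRatio": 4.0, "OtherSensor": 0.2, "TotBypassPressure": 6.0},
    {"FuelPressureRatio": 5.0, "OtherSensor": 0.4, "TotBypassPressure": 7.0},
]
ranking = rank_by_mean_gain(events, list(events[0]))
# ranking == ["TotBypassPressure", "FuelPressureRatio", "OtherSensor"]
```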

I decided to re-create the plot using a red, yellow, and green traffic lighting scheme and placed them side by side below for comparison.

I chose the RdYlGn color scheme, set the steps slider property to 8 to have fewer gradients in the legend, set the start color scale to 0 and the end color scale to 6.

I like the green/yellow/red color version with fewer “steps”. It gives a “choppier” legend and fewer colors in the chart, but it makes it easy to identify variables that are experiencing high gains values. In either case, with Grafana, you can select colors and legend layouts that are meaningful to you.
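To see why fewer steps produce a choppier legend, it helps to work out how wide each color band is. The sketch below assumes the color scale is split into equal-width bands between the start and end values and that out-of-range values are clamped to the end bands. That is a simplification for illustration, not a description of Grafana's internal implementation, and the function names are hypothetical.

```python
def band_width(start, end, steps):
    """Width, in gain units, of one color band on an evenly divided scale."""
    return (end - start) / steps


def band_index(value, start, end, steps):
    """Zero-based band that `value` falls into, clamped to the scale ends."""
    if value <= start:
        return 0
    if value >= end:
        return steps - 1
    return int((value - start) / (end - start) * steps)


# Settings from the two charts in this post.
blue_width = band_width(-0.2, 8, 30)    # roughly 0.27 gain units per band
traffic_width = band_width(0, 6, 8)     # 0.75 gain units per band

# A gain of 3.0 lands in band 11 of 30 on the first scale, band 4 of 8 on the second.
blue_band = band_index(3.0, -0.2, 8, 30)
traffic_band = band_index(3.0, 0, 6, 8)
```

Under these assumptions, each band in the traffic-light version covers almost three times the range of gains, which is why small differences between variables disappear while large ones stand out.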

Finally, I re-configured the Y Axis label placement (to the right) and rotated the Grafana chart 90 degrees for this screen capture to make it easier for anyone wanting to compare the two charts. Even without sorting the variables according to gains value, it is still easy to select the variables with the highest gains by color.

Grafana provides useful visualizations for testing SAS Event Stream Processing projects as shown in this series.

I am grateful to Tom Tuning for his help with using Lua to process arrays in SAS Event Stream Processing for Grafana. You can see a related post from Tom here: Fun-with-Lua-and-SAS-Event-Stream-Processing

Mission accomplished.

Thanks for reading!

Find more articles from SAS Global Enablement and Learning here.
