Identifying Anomalous Frequencies in Signal Data with Short-Time Fourier Transform using SAS/IML

The purpose of this post is to learn how digital signal processing tools can be used to identify anomalous frequencies in sensor data. We start by learning about digital signals and the Fourier Transform before introducing the short-time Fourier Transform (STFT). We then use the SAS Interactive Matrix Language (SAS/IML) to simulate a digital signal with unwanted high-frequency noise and use the STFT to identify the presence of this unwanted noise.

Introduction to Analog and Digital Signals

Sensors can provide useful real-time information about the condition or quality of devices or components. These sensors could measure simple scalar properties like voltage in a circuit or temperature in a kiln, but they could also measure more complex vector quantities like the wind speed and direction at a manufacturing facility. In both cases the sensor will output data that effectively looks and acts like time series data (each observation output by the sensor will have a value for when it is output and a scalar or vector value for the measure of interest), but it is more useful to think of this sensor data as a digital signal instead of treating it as a time series.

Digital signals are sampled versions of continuous analog signals. Analog signals are continuous functions of time and cannot be represented in a computer. A very simple example of an analog signal would be the voltage output from an ideal alternating current (AC) outlet as measured by an analog oscilloscope:

Select any image to see a larger version.
Mobile users: If you do not see this image, scroll to the bottom of the page and select the "Full" version of this post.

Some important properties of periodic signals like this are the amplitude of the signal, the frequency of the signal, and the phase of the signal. The amplitude defines the height of the peaks in the signal, while the frequency is the inverse of the time between the peaks (the time between the peaks is called the period of the signal). Of course, most real signals are not just simple sine waves like the ideal AC outlet and often contain multiple components with different frequencies and amplitudes.

Although a sensor device may measure the voltage in a circuit every 0.5 seconds and output a series of numbers corresponding to the voltage over time, the actual voltage over time is a continuous function of time instead of a collection of discrete samples. The process of sampling the circuit every 0.5 seconds turns the analog signal into a digital signal; if this process is done carefully the digital signal will represent the phenomenon of interest in the analog signal. The key idea is that the sampling frequency (i.e., the frequency with which we take digital samples from the underlying analog signal) sets an upper limit on the frequencies in the original analog signal that can be identified in the sampled digital signal. The Nyquist frequency is equal to half the sampling frequency and is the largest frequency that can be represented correctly by the digital signal. This means when sampling analog signals, we must choose the sampling frequency to be at least twice as large as the largest frequency in the signal, else we will lose important information moving from the analog signal to the digital signal.

In picture above the orange dots are the samples in the digital signal while the blue line shows the original analog signal. A good sampling frequency like the one above successfully captures a representation of the analog signal with the samples.

Now in the picture above we try to use a lower sampling frequency with the same signal. This time our equally spaced sampling points always correspond to zeros in the analog signal. The digital signal sampled here will just be a collection of samples all with an amplitude of zero, even though the analog signal was not just constantly zero. A bad sampling frequency yields unclear results and fails to capture the information present in the original analog signal.

Introduction to Fourier Transforms

Since most real-world signals contain multiple components with different frequencies and amplitudes, just looking at the numeric values of the digital signal won’t provide much insight or understanding about the device that generated the signal. It is often more useful to break down the signal into its different frequency components and then analyze those components to either gain an understanding about the device, or to create inputs for predictive models. This conversion from the time domain to the frequency domain can be accomplish using the Fourier Transform, a mathematical approach to turn a signal as a function of time into a signal as a function of frequency. When working with digital signals we use a Discrete Fourier Transform (DFT) since we are operating on discrete samples, and it turns out there is an incredibly efficient algorithm for calculating DFTs called the Fast Fourier Transform (FFT). The FFT is one of the most important algorithms of the past century and enabled a lot of the functionality we are used to with modern digital computers and communication devices. The fundamental idea behind the Fourier transform is best illustrated in pictures:

This is a plot of digital signal in the time domain: the x-axis is time in seconds, and the y-axis is amplitude. This signal only contains a single frequency component, and it looks like the time between peaks is about 0.2 seconds, so the frequency would be (0.2)^-1 = 5 Hz. We can see this by looking at a plot of the Fourier transform of this signal (calculated using the FFT algorithm):

This is a plot of the digital signal in the frequency domain: the x-axis is frequency in Hertz (inverse seconds) and the y-axis is amplitude. Although the peak is not a perfect vertical line (spectral leakage and limited frequency resolution leads to non-ideal results, but we won’t get too detailed in this blog post), we can see that there is a peak at 5 Hz, matching what we expected from looking at the signal in the time domain. This simple example illustrates the basics of the output we get from Fourier transforms, but we don’t gain much from looking at this simple signal in the frequency domain. The story is changed when we have multiple different frequency components in the signal.

In this example our signal has two different frequency components, with short peaks mixed in with taller peaks in the time domain. Knowing our signal has two frequency components we can probably estimate the different frequencies by looking at the time between the peaks, but it is much easier to interpret the signal contents when we take a Fourier transform and move to the frequency domain:

Here we can clearly see that there is a small peak at 5 Hz, and a larger peak at 10 Hz. Real-world signals will contain many different frequency components and so the Fourier transform of these signals will have many peaks in the frequency domain, some of which may even overlap.

Introduction to Time-Frequency Analysis and the Short-Time Fourier Transform

The Fourier transform makes it easy to analyze the components of signals with constant frequency, but if the frequency changes during the signal, the Fourier transform will not tell us when this change occurs (or even that it occurs at all), instead it will just output all the frequencies in the signal at any time in the signal duration. Basically, the Fourier transform is time-independent; if we want to understand the time-dependence of frequencies in the signal we must use a different approach. This is best illustrated by looking at the Fourier transforms for two very different signals with the same basic frequency components:

This first signal starts with a 5 Hz sinusoid and after 1 second the frequency increases to 15 Hz. When we take the Fourier transform of the signal using the FFT algorithm, we see two peaks in the frequency domain, one for the 5 Hz portion of the signal and one for the 15 Hz portion of the signal. Although this tells that the signal has two different frequency components, it gives us no information about the timing, so we are not able to see that the signal only has one frequency at a time, it just changes after 1 second.

This second signal is just a linear combination of a 5 Hz sinusoid and a 15 Hz sinusoid. There is no time dependence in the signal and the signal always has these two frequency components. The Fourier transform basically shows the same things as the previous example, a peak at 5 Hz and a peak at 15 Hz. In these examples we have simple signals, so we can look at the time domain to distinguish between the signal with the time-dependent frequency and the signal with the time-independent frequencies, but the frequency domains for these two signals look the same. When working with more complex real-world signals we won’t find it easy to interpret frequency changes in the time domain, and using the Fourier transform to move to the frequency domain won’t help in identifying timing information.

The solution to our problem is to perform a sequence of windowed Fourier transforms, with the windows each covering a small portion of time, so that we can plot a sequence of Fourier transforms over time to uncover how the frequencies change. This approach is called a Short-Time Fourier transform (STFT) and the result is a collection of individual Fourier transforms for different time windows of the original signal.

Instead of performing a Fourier transform on the entire signal, we break it into two different windows, performing a Fourier transform on the first window from 0-1 second and then a performing a second Fourier transform on the window from 1-2 seconds. Each of these transforms will yield a frequency spectrum, which can then be plotted in sequence. This approach reveals that the signal only contains a 5 Hz component from 0-1 second and only contains a 15 Hz component from 1-2 seconds.

More realistic STFT calculations involve many more windows and are often plotted using a two-dimensional heatmap, but the fundamental idea is the same as in this toy example. We can access FFT and STFT functionality in a couple of different ways using SAS software. The SAS Interactive Matrix Language (SAS/IML) contains a variety of easy-to-use digital signal processing tools, including a routine to calculate FFTs and STFTs. The TSMODEL procedure in SAS Visual Forecasting (SAS VF) also includes a Time-Frequency Analysis (TFA) package with objects for performing FFTs and STFTs. Finally, SAS Event Stream Processing (SAS/ESP) includes the STFT as an online algorithm that can be applied to windowed streaming data. Let’s explore an example of plotting the results of a STFT using SAS/IML. In this context SAS/IML is most useful for exploring digital signals and prototyping workflows, SAS VF is most useful for analyzing traditional time series using digital signal processing tools, and SAS/ESP is most useful for deploying digital signal processing workflows.

Spectrogram STFT Example using SAS/IML

To illustrate the use of STFT for anomaly detection we simulate a simple sinusoidal signal with a short period of injected low-amplitude, high-frequency noise in the middle of the signal duration. To be specific, we simulate a 30 Hz signal for 1000 time points (with a sampling frequency of 500 Hz this corresponds to a signal with a 2 second duration), and we inject 50 samples of 200 Hz noise for a duration of 50 samples (0.1 seconds). This noise is injected with an amplitude of 5% the original signal amplitude, so it doesn’t make a noticeable change to the time-domain plot of the signal, but its presence becomes noticeable in the time-frequency domain. This example is generic by intention to illustrate the core concepts of STFT, but we could imagine this signal being a voltage measurement from a circuit that has temporarily been coupled to a source of high-frequency noise, or a sound/vibration measurement from a turbine device with a damaged blade that scrapes against a surface briefly during the operation. Either way, our example is a situation where we ideally expect a consistent 30 Hz signal to be output from our device, but some anomalous event has injected unwanted high-frequency noise in our signal, and we want to identify when this happens so we can potentially remove it.

We start in SAS/IML by defining the sampling frequency and establishing an array of time values for the simulated signal. The detailed code below to create the data is included as a reference for interested readers (and to make it easier to reproduce the demonstration). If you are not as comfortable with SAS/IML syntax, the main focus of this blog is on using the Spectrogram subroutine to visualize the frequency spectrum. The code to plot the spectrogram is the last block of code after the plot of the input signal, and although it is still part of SAS/IML, the syntax is fairly simple.

proc iml;
 
	f_s = 500; /*sampling frequency*/
	f_s2 = f_s/2; /*Nyquist frequency*/
 
	t_s = 1/f_s; /*sampling period (time between samples)*/
	N_samp = 1000; /*number of samples*/
 
	/*array of times (first half)*/
	t1 = do(0, (N_samp/2-1)*t_s, t_s); 
	/*array of times (section with high-frequency noise)*/
	t2 = do((N_samp/2)*t_s, (N_samp/2+50)*t_s, t_s); 
	/*array of times (second half)*/
    t3 = do((N_samp/2+51)*t_s, (N_samp-1)*t_s, t_s);

The sampling frequency is 500 Hz, so the largest frequency we can identify in the digital signal will be 250 Hz. The arrays t1 and t3 will be used to simulate the normal signal, while the 50-sample array t2 will be used to simulate the signal with the high-frequency noise injected.

    f1 = 30;
	f2 = 200;
	pi = constant("pi");

	/*signal for the first half*/
	x1 = sin(2*pi*f1*t1);
 	/*signal corrupted by a burst of high-frequency noise*/
    x2 = sin(2*pi*f1*t2) + 0.05*sin(2*pi*f2*t2); 
	/*signal for the second half*/
	x3 = sin(2*pi*f1*t3); 

	/*combined time from N=0 to N=2000*/
	t = t1||t2||t3; 
	t = t*1000; /*convert time to milliseconds*/
	/*combined signal with a burst of high-frequency noise*/
	x_in = x1||x2||x3;

	create work.input_signal var {"t","x_in"};
	append;
	close work.input_signal;

The signals x1 and x3 are the normal signal with frequency 30 Hz, while the signal x2 is the short duration with the 5% amplitude high-frequency noise injected into the signal. We can see that x2 is just a linear combination of the original normal signal with the low-amplitude noise. After defining the 3 different time arrays and 3 different signal arrays, we concatenate them in SAS/IML to create the overall time array t and the overall signal array x_in. We write these arrays to a SAS dataset called work.input_signal so we can plot them using SGPLOT.

    /*plot our signal in the time domain*/
	submit;
	title "Input Digital Signal";
	proc sgplot data=work.input_signal;
		 series x=t y=x_in;
		 xaxis grid label="Time (ms)";
		 yaxis grid label="Amplitude";
	run;
	endsubmit;

Plotting the signal as a function of time we can see there looks like there is some distortion around 1000 ms in the signal, but it’s unclear exactly what is happening in the signal, and it doesn’t look like there is a 200 Hz component added into the signal. Visually, the frequency of the signal in the above plot looks mostly constant; of course, visually inspecting the plot of a signal is not a good way to determine the frequency components in the signal. Let’s use the STFT to plot a spectrogram.

    /*The window length will have a big impact on the time/frequency resolution*/
	winLen = 25; 
	window = tfwindow(winLen, "Hanning");
	noverlap = 10;
	/*increasing nfft zero-pads the windows and increases the number of frequency bins*/
	nfft = 64; 
	/* reverse "spectral" color ramp */
	ramp = palette("Spectral",7)[7:1]; 
	title "STFT";
	call spectrogram(x_in,window,noverlap,nfft) scale="log"
                 		    colorramp=ramp sampRate=f_s panel=1;
quit;

Before we look at the spectrogram output lets briefly discuss some of the important STFT options used in this example:

Window Length – the window length (winLen = 25) determines how many time samples are included in each time window of the STFT. Longer window lengths mean more samples per window and thus higher frequency resolution. Shorter window lengths mean more windows overall given a fixed signal duration and thus higher time resolution. There is always a fundamental trade-off in resolution between time and frequency when performing the STFT.
Window Type – the window function choice affects spectral leakage across the frequency bins. In this example we use a Hann (often called Hanning) window function for each segment. The basic idea with windowed transforms is to eliminate negative effects from using a rectangular window (which is used by default if another window function is not chosen). The details of how and why this works are outside the scope of this blog, but spectral leakage and window functions are an important topic in digital signal processing.
FFT Length – The number of FFT samples (nfft = 64) is the number of points used by the FFT algorithm to calculate the frequency spectrum for each time window in the STFT. This number must be greater than or equal to the window length; if nfft = winLen then no zero-padding will occur and the FFT will be performed on the windowed signal. If nfft > winLen then the windowed signal will be padded with zeros until it has the length specified by nfft. These added zeros help by ensuring we can use efficient FFT algorithm (by ensuring that the number of samples is a power of 2), but they also help by increasing the number of frequency bins in the FFT output. This doesn’t fundamentally increase the frequency resolution of the STFT since the output will be interpolated across the frequency bins, but it does improve the plotting of the results.
Window Overlap – the overall between windows (noverlap = 10) determines how many samples overlap between subsequent windows in the STFT analysis. Choosing no overlap would mean that information near the boundaries of each window is lost, while choosing a very large overlap would mean a lot of redundant information is included in each window, increasing processing time unnecessarily. In general, a good practice is to choose the overlap to be around 50% of the window size. In this example we use an overlap of 10 with a window size of 25, so our overlap is 40% of the window size.

We use the call spectrogram() routine in SAS/IML to calculate the STFT and plot it in a two-dimensional heatmap, all in one function call. We provide the input signal x_in, the STFT settings described above, the sampling frequency for the underlying signal so we can output frequency in Hertz, and a scale for the power in each frequency bin. We use the log scale (the decibel scale is also useful here) since the power values outside of the 30 Hz signal are hard to distinguish in the linear scale. The option “panel=1” is chosen so that we output a panel of results, including the time domain representation of the signal, the time-frequency heatmap, and an averaged view of the frequency spectrum across all time.

Looking at the output of the spectrogram, we can see the 30 Hz component contains most the power in the heatmap. On a linear scale this would dominate all the other frequency components, making it hard to see lower power high-frequency components in the plot. On the log scale we can see there is non-negligible power around the 150-250 Hz range between Time = 1.0 seconds and Time = 1.1 seconds. The orange smear around 200 Hz corresponds to the injection of the high-frequency noise into the signal, and the spectrogram provides us detailed information about the timing of when the noise appeared in the signal, along with the frequency of the injected noise. Fundamental limitations associated with the STFT still apply, so we don’t perfectly localize the noise in time and frequency. It looks like the high-frequency noise is spread out between 175 and 225 Hz (due to spectral leakage), and the pixelation of the spectrogram clearly illustrates the limited resolution in time and frequency. We could improve the frequency resolution by using longer windows (and thus fewer windows) in the STFT, but this would degrade the time resolution, making it more difficult to identify when the noise was injected. Here is an example of changing some settings as follows:

winLen -> 250
noverlap -> 100
nfft -> 256

Here we can see that the frequency bands are much narrower, although we still have some spread around the true frequencies of 30 Hz and 200 Hz. The tradeoff is in the time resolution, it looks like the high-frequency noise was added around 0.75 seconds and didn’t end until about 1.25 seconds. This loss in time resolution makes sense since we used longer windows, so the high-frequency noise appears (with at least a little bit of power) in all the time windows between 0.75 and 1.25 seconds.

Anomaly Detection Using STFT

We can use the STFT to perform anomaly detection by collecting normal signals (signals that we are confident do not contain anomalies) and looking at their frequency content. We plot the spectrogram of the non-anomalous data and identify major frequency components and when they occur in the signal. This characterizes the normal data, which we will use as a baseline for judging whether signals contain anomalous frequencies. We could then perform a real-time STFT on all new signals and compare the frequencies we see in the spectrogram with those from the normal FFT. Any extra frequency components we identify in the new signals would correspond to anomalies in the data (assuming we did an adequate job of characterizing the normal data signals). This is a kind of unsupervised anomaly detection where we don’t have to collect data representing the anomalies, we only collect data to represent the normal operating conditions of the system that generates the signal.

Continuing with the example of a 30 Hz signal with 200 Hz noise injected, our normal signal would be just the 30 Hz signal with no 200 Hz noise.

The spectrogram of the normal operating conditions reveals that most of the power is concentrated around 30 Hz. To be more specific, we can identify that the normal operating conditions have no significant frequency components above 100 Hz (the 30 Hz signal leaks spectral information around the true frequency), so anytime we see significant power above 100 Hz we should identify it as an anomaly. Of course, we must be specific about what significant power means, so we are looking for signals with the log of the power greater than about -4 in frequency ranges above 100 Hz. Since we want the log of power to be greater than -4, on the linear scale we want power > 0.018; this can be used to threshold anomalies when we go to evaluate new signals for anomalies using the spectrogram.

This approach for detecting anomalies can be done entirely using SAS/IML using the digital signal processing tools, but we could also use the SAS Visual Forecasting tools in the Time Frequency Analysis Package (TFA) in PROC TSMODEL to perform the STFT to detect anomalies. SAS/IML and the TFA package are great for offline analysis of digital signals, but to perform real-time scoring of anomalies we would need to use the Calculate Window with the STFT algorithm SAS Event Stream Processing. A subsequent blog post will discuss deploying anomaly detection models in SAS/ESP.

References/Useful Links:

This is a great introductory article talking about Fourier analysis and the FFT algorithm

Find more articles from SAS Global Enablement and Learning here.