As Rick Styll pointed out in his recent post announcing the availability of SAS Visual Analytics (VA) 8.2, there are just over 100 enhancements in this latest release. Some are obvious, like the new and improved objects. Others are subtle, like some of the enhancements and refinements to the user interface. And then there are the new calculations. Where you need to roll up your sleeves and explore what they are and how you can leverage them in future visualizations. In this blog I want to explore a new 1-click calculation that is as versatile as it is powerful: cumulative total.
Definition from the VA 8.2 user guide – “Cumulative total displays a running total of all of the values for the measure on which it is based. For example, in a list table, each cell for the cumulative total displays the sum of the current value and all of the previous values, according to the sorted order of the rows in the list table.”
Well that seems easy enough. The versatility comes from what you can do with it. If you need to create running totals or moving averages in your reports, then cumulative totals might just be your new best friend in SAS VA 8.2. Let’s walk through a few examples.
Below is a very common report where I created a line chart showing the value of a selected measure over time. In this case I am showing product sale by month. Very simple. I also added a list table showing the same values.
You can easily see spikes in the data, which shows the seasonality of product sales. But what if I wanted to see the cumulative value of this measure over time, without regard to a fiscal or calendar year? You can easily create a 1-click cumulative total calculation. Right click on your measure and select cumulative total as your type.
Let’s drag our new calculation into both objects and see what it looks like.
Both the line chart and the list table show the values steadily climbing over time, without regard to any time interval. If you were interested in having the cumulative total reset at a common time interval like year, you could use the cumulative period calculation (a different aggregated-periodic calculation), which can easily give you totals such as year to date of monthly values. But that’s not what we’re after in this example. We’re looking at the cumulative total over the entire time horizon, starting with the first value through to the last.
By adding a filter to the report, these charts could become much more interactive. I’ll add a slider of transaction month to the page control area, so it will automatically filter both objects based on my time selection.
After changing the time horizon, the cumulative total now starts at the beginning of the new date range. Without changing the formula, I am able to reset the cumulative total based on an interactive filter. This is a simple yet effective way to get a running total dynamically over any selected time horizon. I could look at the last 12 months, the last 24 months, lifetime to date, etc.
So what is happening under the covers when we create this 1-click calculation? Let’s take a quick look at the formula editor.
The 1-click calculation for cumulative total is using the AggregateCells function. There are a few options you can change, such as the aggregation type (average, count, max, min, sum), the direction of the calculation (row or column) and where does your query start (with an offset if you choose), etc.
To get a better understanding of the AggregateCells function, let’s create a new calculation from scratch. This time I will sum the value of 1 for every line of the query, instead of basing it on a measure in the data. I’ll start this at the first line of the query.
If I add this to a report, I will end up with a row number. In addition to showing you the mechanics of how this calculation works, you might find times where row number helps in the navigation of a list table.
Keep in mind that this is a cumulative total is based on the sort order of the query. If I were to resort the list table not by date, but by product sale in descending order, then the rows would be reordered. The row number would still begin at 1, at the first row, but rows are now sorted based on another criteria.
So where else might you use this type of calculation? Maybe you want to sort by product line, and see an accumulation of sales as you add additional product lines to the report. Or maybe you need to show cumulative totals over multiple years, for example in liquidity risk reporting. Or maybe you want to show the lifetime sales of a product, or sales over a specific duration. The point is that you can create a cumulative total not just across time, but across any dimension.
Creating a moving average is another great new addition to the growing list of 1-click calculations. While creating a moving average calculation was possible in previous releases of VA, it was not the simple 1-click calculation it is today in VA 8.2.
In this next example, we return to our simple line chart of product sales by month.
I will right click over my measure (product sales) and select moving average from the list. This is also an opportunity for me to show you all the other 1-click calculations available when right clicking on a measure.
When you select moving average, the default for number of cells to average is 5. We will select this value for our example.
That’s it. Adding the new moving average calculation to the line chart produces the following result.
When I edit the moving average calculation, you can see how it calculated. While similar to the above row number calculation, it does add a bit more complexity.
The default value was 5, so the calculation is the average of product sale, for the current period minus 4 (equals the default of 5). Much easier than previous releases of VA, even if you created this one from scratch.
Since the Super Bowl is right around the corner, I decided to build the last report example showing Super Bowl winning teams. For those located outside the US, the Super Bowl is the annual championship game for the NFL (American football). To date there have been 51 Super Bowl winners, which are displayed in the following pareto chart.
This chart quickly highlights which teams have won the most Super Bowls. All NFL teams are along the x axis, represented as bars in descending order based on the number of wins. The line displays the combined total of of all previous bars as a percentage (cumulative total) until it reaches the final bar (100%). I also added a background display rule (yellow background) at 75%.
Since the line is a cumulative percent of total wins, the intersection of the line and display rule criteria (shown by the circle marker above Washington), shows the point where 75% of all Super Bowl wins are accounted for (well technically 74.51%). Another way to say that is 9 teams account for 75% of all Super Bowl wins. It's not exactly the 80/20 rule we often hear about, but it's pretty close at 75/28.
The other thing you’ll notice is that 13 teams (41% of number of teams) have no Super Bowl wins. The cumulative total reached 100% after Tampa Bay so the last 13 teams do not add to the cumulative total.
It is a straight forward exercise to create the above Pareto chart with the dual axis bar-line chart object and start exploring the cumulative totals functionality now available in VA 8.2. Below is the calculation for “Wins (cumulative % of total)”, which was the ascending line in the above chart. Once again it is using the AggregateCells function to accumulate the wins, but this time dividing by the total number of wins (game played). I have also included a small Excel file of Super Bowl winners if you want to start playing around with this functionality yourself.
As you start to discover all the new enhancements in VA 8.2, take some time to look at the new calculations. In this blog I showcased the cumulative totals, which can work across any dimension, not just time. They can be used in many different use cases and open up a whole new world of possibilities.
Take a 1-click calculation for a test drive and even edit the results. Build one of these calculations from scratch. Just explore the new calculations, add them to a visualization and see what they can do for you. Happy visualizing!
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.