SAS Visual Analytics Advanced Calculations (part 1 of 4): AggregateCells

13 Likes

If you spend too much time engineering a solution for applying your complex business rules into a calculation in SAS® Visual Analytics, or feel it can’t be done, or simply want to learn something new and powerful about SAS® Visual Analytics calculations, this article is for you!

SAS® Visual Analytics provides an extensive list of operators that can be used in expressions, but some types of calculations may become very challenging, and defy your ability to solve a few problems, independently of your industry or business area. SAS® Visual Analytics 8.3 brings a series of new enhancements that together are so powerful that they allow you to go above and beyond the capacity of delivering the insights that you need. Those capabilities are: aggregated data sources, data source join, AggregateTable operator, and AggregateCells operator (this last one available since VA 8.2).

This article kicks off a series to explain each one of them separately and the last one guides you through an example that leverages them to deliver the expected results for a specific Per Member Per Month (PMPM) use case. Did you get stuck in the past trying to produce reports with non-trivial calculations? This is your chance to rethink them.

We are going to start with AggregateCells operator.

AggregateCells

The AggregateCells operator aggregates values of a specific set (or range) of cells for an object. It can be used to calculate cumulative totals, running average, etc.

In SAS Visual Analytics 8.3, the AggregateCells operator can be found under the Aggregated (tabular) group, on the left side of the expression window, when creating a new calculated item:

01-AggregateCells operator

In SAS Visual Analytics 8.2, this operator is found under the group called Aggregated (advanced).

The AggregateCells operator has a few parameters, as indicated in the picture below:

02-AggregateCells parameters

The aggregation that is applied to the measure
1. Average
2. Count
3. Max
4. Min
5. Sum
The measure to aggregate
The direction in which the values should be aggregated.
1. Column
2. Row
3. Default
The starting point for the aggregation (that can be an offset from that point)
1. Current
2. Start
3. End
The ending point for the aggregation (that can be an offset from that point)
1. Current
2. Start
3. End

Some of those parameters may need a bit of additional explanation, like the direction of the aggregation and the range (starting and ending point) of the aggregation.

Direction – parameter #3

According to the documentation, the default option specifies that the direction is selected automatically (column or row) based on the contents of each object. Trying to map columns and rows into a line chart for example is not that straight forward, so I like to think of them as vertical and horizontal respectively. In the case of a line chart, the default is the same as row (or horizontal), as the category role is laid down across the X axis.

Normally the default option should work as expected for most of the objects, but there might be cases where you need to change the direction, especially with crosstab objects, where the definitions of rows and columns are clear, and it might make sense to aggregate in a different direction. For example:

If a crosstab has categories across *rows only*, the default option is the same as column (vertical aggregation), independently if measures are in rows or columns:

03-Aggregation direction - example 1

If a crosstab has categories across *columns only*, the default option is the same as row (horizontal aggregation), independently if measures are in rows or columns:

04-Aggregation direction - example 2

If a crosstab has categories across *rows and columns* and measures are in columns, the default option is the same as column (vertical aggregation):

05-Aggregation direction - example 3

If a crosstab has categories across *rows and columns* and measures are in rows, the default option is the same as row (horizontal aggregation):

06-Aggregation direction - example 4

Starting and Ending Point of the Aggregation and Offset – parameters #3 and #4

These parameters are the ones that define the set of cells to be aggregated. The option *start* specifies the first value in the row or column. The option *end* specifies the last value in the row or column. The option *current* specifies the current value being processed in the row or column. The offset just moves the pointer around the value:

07-Aggregation range and offset

Example 1 – Cumulative Total

All previous pictures used to explain the parameter direction are calculating cumulative total. To implement cumulative total, you just choose Sum as the aggregation, then make the starting point equals to *start* with offset zero, and the ending point *current* with offset zero:

08-Cumulative total with AggregateCells

These are the charts for Order Quantity and its cumulative total:

09-Cumulative total

Example 2 – Line Numbers

If you replace the measure Order Quantity in the previous example with a fixed number 1, the expression generates sequential numbers that can be used as line numbers:

10-Line numbers with AggregateCells

11-Line numbers

Example 3 – Moving Average

To implement moving average, let’s say for a window of 7 days, you first make the aggregation equals to *Average*, then the starting point *current* with offset -6, and ending point *current* with offset zero:

12-Moving average with AggregateCells

13-Moving average

Both Cumulative Total and Moving Average can also be obtained via “one-click” calculations. In our examples, you just need to right-click on the Order Quantity measure under the Data pane and select them from the dropdown:

14-One-click calculations

It’s not difficult to imagine how the AggregateCells operator could be used to derive more complex calculations to compare data elements from different offsets, such as the difference, the ratio, etc.

In the next article you will be exploring the AggregateTable operator.

Additional resources on advanced calculations and aggregations:

Documentation: Aggregated tabular operators
Community article: SAS Visual Analytics Advanced Calculations (part 2 of 4): AggregateTable by SAS' Renato Luppi
Community article: SAS Visual Analytics Advanced Calculations (part 3 of 4): Data Source Operations by SAS' Renato Luppi
Community article: SAS Visual Analytics Advanced Calculations (part 4 of 4): PMPM Example by SAS' Renato Luppi

sampathkumar · ‎09-03-2019

Hi

Please provide links to SAS Datasets explained in your Series for better understanding

Thanks and Regards

Sampath

touwen_k · ‎11-19-2021

very good article, the one thing worth mentioning is that aggregate cells changes depending on the sorting of data. What is the best practice: proc sort in SAS Studio and promote table, to have the desired results, or are there other options in VA to sort the data so that the results does not change? regards Karolina T.

Renato_sas · ‎11-19-2021

Exactly. AggregateCells perform a visual aggregation by referring to other cells, so their position in the visualization affect the results, which could or could not be important, depending on the calculation being performed.

Pre-sorting the table prior to loading it in memory will not help. Ideally we should prevent report consumers from changing the sort order, which cannot be done currently. I'll add a feature request for that. Thank you, @touwen_k !