BookmarkSubscribeRSS Feed

Using the maximum date for by groups in VA reports

Started ‎10-17-2019 by
Modified ‎10-17-2019 by
Views 7,954

Finding the latest or maximum date for use in the report as a title or filter is a common request that I addressed in an earlier article. In this post, I will cover how to use the Aggregated Data Source feature to display the maximum date for a by group. That is, when you have a category data item where you want to see the maximum date of activity per category item. Some example business use cases include:

  • Max build date of environments/source code grouped by project name
  • Max date of customer reviews grouped by location
  • Max date of team member activity grouped by manager
  • Max date of transaction activity grouped by transaction owner

Below is the example visualization we are aiming for. In this case, Project Type is our group by category and Manager and Days Active are considered extra information. This mock data could represent the number of days

  • an environment last pushed code or hotfixes
  • a server cluster has been running
  • a service track has been open
  • or even the last time a machine was serviced.

01_ResultObject.png

 

Here are the steps to find the Maximum Date with By Groups which will create the above visualization:

  1. Convert SAS date to a number and apply non-additive aggregation
  2. Create an aggregated data item to find the MAX of the date as number data item
  3. Create an aggregated data source including the MAX data item and the group by data items
  4. Optional: create a join with the aggregated data source and the original data source to include any additional columns required for reporting

Data Source and Business Question

In this post, I am using mock data that has a Star Wars theme. To reiterate, only mock data, no need to debate whether Darth Vader would in fact report to Princess Leia. Here is a screenshot of part of the data using the List Table object.

 

My business question:

”What is the most recent date of a Project Type, who is the owning Manager, and how long has that project been active?”

 

02_DataSource.png

Step 1: Convert SAS date to a number and apply non-additive aggregation

Similar to the technique used for finding the maximum date for a title or filter, the first step in this process will be to convert our SAS date to a number. Once it is a number, we must assign a non-additive aggregation to the data item.

 

From the Data pane, click New data item and select Calculated item....

 

03_NewCalculatedItem (1).png

 

From the Operators pane, use the search field to quickly find the TreatAs operator. Then select the Date data item from the Data Items pane and be sure to give a meaningful name the new data item. We can see that the result type returned is numeric. This will create a new column, Date as Number, and every row in the data will have the numeric representation for Date.

 

04_NewDateAsNumber.png

 

Next, from the Data pane, use the chevron arrows next to the data item we just created to change its aggregation to a non-additive option. Either maximum or minimum will work fine.

 

05_NonAdditiveAgg.png

 

Let’s check and see what this new data item looks like in the detailed data and aggregated data. It’s always a good idea to double check to make sure you didn’t forget to assign the non-additive aggregation property. We can see that for each Date value it has a matching Date as Number value and it is correctly not aggregating as we can confirm in the Crosstab object.

 

06_VerifyDateAsNumber.png

 

Step 2: Create an aggregated data item to find the MAX of the date as number data item

Next, we will create an aggregated data item to find the maximum date for a by group. In the previous step, we created a new data item for every row of data, the Date as Number data item. Now we will create an aggregated data item that will only be evaluated when assigned to an object and its value is dynamically driven by the categories assigned to the object. This can be referred to as an “on-the-fly” calculation.

 

From the Data pane, click on the New data item button again and select Calculated item....

 

07_NewCalculatedItem.png

 

From the Operators pane, use the search field to quickly find the Max aggregation operator. Then select the newly created Date as Number from the Data Items pane. You can see that I left the _ByGroup_ as the aggregation level. The other option is _ForAll_ but this would evaluate the maximum date for the entire data source and that’s not what I’m after. I want the maximum date for each by group, which is why we will use the _ByGroup_ aggregation level. Next, be sure to give a meaningful name to the new data item. Also, notice the result type is an aggregated measure.

 

08_MaxDate.png

 

Let’s check what this aggregated measure gives us, refer to the screenshot below. You might be thinking this looks pretty good! All we need to do is convert the numeric date back into a SAS date. And you would be right, that’s exactly what we need to do.

 

09_VerifyMaxDateByGroup.png

 

However, this is where we will run into the miss-matched aggregation level. Recall that the TreatAs operator wants to create a new data item for every row of data but the Max operator is an aggregated data item that gets evaluated “on-the-fly”. Thus we have a miss-match at the operand level. What do we do? Use an aggregated data source!

 

10_OperandMissMatch.png

Step 3: Create an aggregated data source including the MAX data item and the group by data items

Now we will create an aggregated data source but only include our maximum date aggregated measure and any category data items we want it to be grouped by. Be careful not to include extra columns yet as that will negate our intended by group maximum date aggregation and you’ll end up with your entire detailed data source again.

 

The idea is to only include the category data items you want your maximum date to be grouped by. Remember that this maximum operator is an “on-the-fly” calculation and will give us that dynamically driven by group maximum we want. If you need more of the detailed information, don’t worry, we can use the data join feature to add those columns back in which I cover in the next step.

 

From the Data pane, use the data source menu and select New aggregated data source....

 

11_NewAggregatedDataSource.png

 

From the new aggregated data source dialogue, select Max Date and Project Type and use the arrow to move them over to the Selected items window. You can see below, in the Preview pane the resulting rows that will make-up the aggregated data source.

 

This is where it’s a good time to recall our business question:

”What is the most recent date of a Project Type, who is the owning Manager, and how long has that project been active?”

 

12_CorrectAggDataSource.png

 

We don’t want to include the Manager column in the aggregated data source because that would result in the below rows. You can see that the “on-the-fly” Max Date aggregated measure now calculates the maximum date for the by groups Project Type and Manager.

 

This isn’t wrong, but this isn’t answering our business question: what is the most recent date of a Project Type, who is the owning Manager, and how long has that project been active?

 

The second part of our business question, “who is the owning Manager” is part of the extra detail that we can get after we join this aggregated data source back to our detail data source. What we need from the aggregated data source is the maximum date per Project Type.

 

13_WrongAggDataSource.png

 

Once you click OK, the Data pane should look like this:

 

14_AggDataSourceDataPane.png

 

Notice that Max Date is now grouped under the Measure data items and not under the Aggregated Measure. This is because our aggregated data source only has 5 rows and Max Date is not an “on-the-fly” calculation any more. Now we can use the TreatAs operator to convert the numeric date back into a SAS date.

 

From the Data pane, click the New data item button and select Calculated item....

 

15_NewCalculatedItem.png

 

This time select the _Date_ as the target result type in the TreatAs operator. Give your new calculated data item a meaningful name. Here, I selected the MMDDYY8 format.

 

16_NewMaxDateMDY.png

 

Let’s check out what this aggregated data source gives us. We can see that since I only included the Project Type category in the “Max Date _ByGroup_” List Table object that I get the same results as the “Aggregated Data Source” List Table object but the key difference is that in the aggregated data source we can convert the numeric Max Date into nicely formatted Max Date MDY.

 

17_VerifyMaxDateMDY.png

 

It may be the case that these three steps can meet your business question needs. You can easily use the aggregated data source to feed several report objects. Refer to my original post for all of the details about using an aggregated data source, but the key point to remember is that the aggregated data source is created for each user when the report is opened. This way any “on-the-fly” calculations are refreshed at run time.

Step 4: Optional: create a join with the aggregated data source and the original data source to include any additional columns required for reporting

The optional final step is to add back in the extra columns to answer the entire business question. Recall from earlier:

”What is the most recent date of a Project Type, who is the owning Manager, and how long has that project been active?”

 

There are a couple of ways to accomplish this task, each with their own pros and cons:

  1. Left join the detail data to the aggregated data source and use a filter on objects where date equals max date.
    • Pros – Max date is an additional column in the detail data and can be used for other calculations if desired.
    • Cons – You have to remember to apply the filter if you only want the maximum date rows.
  2. Create an aggregated data source of the aggregated data source and then left join to the detail data.
    • Pros – You have a dedicated and smaller data source with only the maximum date information with its additional columns for the "extra" detail.
    • Cons – You have multiple aggregated data sources that will execute when the report is opened.

I’m going to walk through the first option in the following steps: Left join the detail data to the aggregated data source and use a filter on the object where the date equals max date.

 

From the Data pane, make sure you have the detailed data source active. Then use the data source menu and select New data source join....

 

18_NewDataSourceJoin.png

 

Configure your New Data Source Join dialogue like the below screenshot. Notice that we can only join on the Project Type column. If you tried to add a condition for Date to Max Date this will not validate. That is because the Date is a date type and Max Date is a numeric type. Even though we created the Max Date MDY as a date type in our aggregated data source, it is not available for use in our join. This is where that second option I outlined earlier could be applied if you created an aggregate data source of your aggregate data source.

 

But for this solution, we only have one join condition for Project Type and I included all of the columns from both data sources.

 

19_DataSourceJoinConfig.png

 

After you click OK, here is what the Data pane will look like. Notice the two data items from our aggregated data source. We will need to convert the numeric Max Date to a date type again.

 

20_DataSourceJoinDataPane.png

 

From the Data pane, click the New data item button and select Calculated item....

 

21_NewCalculatedItem.png

 

This should be familiar now. Like before, give the new data item a meaningful name and select the desired date format, I selected the MMDDYY8 format. Be sure to use the _Date_ result type for the TreatAs operator.

 

22_NewMaxDateMDY.png

 

Now we can use the results of the data source join to assign Roles in our report objects. Remember, the key step for the solution we are implementing is that we will need to apply a Filter to the object where Date equals Max Date MDY. This will return the rows that match the maximum date per by group, Project Type in our case, which we found in our earlier steps.

 

23_FilterDateEqualsMaxDate.png

 

Since I have been displaying our results using List Tables up till now, I included both the List Table and a Crosstab to illustrate the final results. I like the Crosstab as it allows me to quickly verify the five Project Types. No visual is complete until you make sure you have fully addressed the business question:

“What is the most recent date of a Project Type, who is the owning Manager, and how long has that project been active?”

I am satisfied that these results answer my business question.

 

24_VerifySolution.png

Conclusion

Mastering this technique used to determine the maximum date in your data source for a by group requirement may become a frequently used solution in answering your business questions.

 

Maximum Date with By Groups Technique:

  1. Convert SAS date to a number and apply non-additive aggregation
  2. Create an aggregated data item to find the MAX of the date as number data item
  3. Create an aggregated data source including the MAX data item and the group by data items
  4. Optional: create a join with the aggregated data source and the original data source to include any additional columns required for reporting

However, the solution to find and display the maximum date for most business questions can be more easily satisfied using these articles:

If you are interested in additional use cases for using an aggregated data source or data source join, please refer to these posts:

Here are additional articles if you’d like to learn more about using dates in your VA reports:

Version history
Last update:
‎10-17-2019 01:03 PM
Updated by:
Contributors

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags