Dealing with wrong aggregation due to text field

Chris11115 · Posted 02-12-2022 08:31 AM

Hello community,

I have a problem with using a text field in an aggregated data source:

I need customer tables, where you can see the change in exposure from date1 (current date) to date2 (date of comparison). The underlying data is on an account level. Through parameters you can dynamically choose the two dates that should be compared and you can filter account classes. An example of an output table would be something like this:

Customer Number (usually @date1, except for tables where I look at customers that are no longer part of the data source in @date1, then @date2)
Customer Name (analogous to Customer Number)
Customer rating @date1
Customer rating @date2
Exposure @date1
Exposure @date2
Change in exposure

This far, I accomplished this task by creating an aggregated customer data source for date1, one for date2 and then creating a final data source using an Outer Join on the Customer Number. This does exactly what I need. However, now we also have to do this part of the report for our largest business scope, including all connected firms and this two-step approach just takes too long, over three minutes to see results. This report does not really rely on speed, but it should be under one minute.

I read up on design recommendations and saw that Joins in a report are not a good choice. Preparing the data beforehand is not an option for me since I would lose the ability to dynamically change compared dates and/or filter the underlying accounts. Therefore I tried to do it all within one aggregated data source.

This was easy for exposures (create two data fields in the original data source for Exposure @date1 and Exposure @date2 using parameters) and customer ratings (similar, by using a Max on nominal representation and then remapping it to categories in the aggregated data source). However, for the life of me I cannot figure out what to do with the customer name.

Problem is that customer names can change for firms over the course of time. When I do nothing and just add the data field to the aggregated data source, I get two rows for customers where the name changed.

In T-SQL I would use something like a MAX function on text data fields like this:

MAX(case when date=@date1 then [Customer Name] ELSE NULL END) as [Customer Name @date1]
MAX(case when date=@date2 then [Customer Name] ELSE NULL END) as [Customer Name @date2]

So I tried to do something similar and create a customer name @date1 and @date2 to deal with this, however, the aggregated data source groups by its NULL values for text fields, I get two rows and calculations for these customers are therefore not correct.

Does anyone know how to deal with such text fields that should not be part of the group by function in an aggregated data source? Or is there possibly a way to significantly improve performance on the original Outer Join solution?

Thank you in advance!

Chris11115 · Posted 03-04-2022 03:55 PM

Any ideas?

Simply not possible? Description too long? 😉

Dealing with wrong aggregation due to text field

Re: Dealing with wrong aggregation due to text field

Catch up on SAS Innovate 2026