Welcome back to the second part of using advanced reporting features in your Visual Analytics 8.3 environment. The previous article, VA 8.3 Aggregated Data Source, describes how to use aggregated data sources to work around the nested aggregated measure limitation. This article will look at the ability to create data joins from your VA report.
A few key points up front about report data joins:
When would you use a VA 8.3 report data join?
When would you NOT use a VA 8.3 report data join?
Let’s take a look at how the data join can enhance your report by looking at how your data source can be augmented. Say this represents your main source data:
You can, for simplicity sake, add columns to your data source but not rows. This is not an append functionality, but a way to join two tables together based on join criteria.
In this example, here is my target visualization. I want to be able to select the Origin and Make of the cars to further inspect the MSRP by the number of Cylinders.
But how can I do this, since my data is stored in two separate tables? You can see that the Cylinders column is in the CARS_SPECS table and the MSRP column is in the CARS_PRICING table. The answer is to join the tables.
You do not have to have the "source" tables you wish you join added to your report. You could create a data source join and have that be your only data source to the report, but as you are designing your report it will be common to have your “source” tables as data sources for the report too.
In this example, I already have both the CARS_SPECS and the CARS_PRICING added to the report. Next, from the Data Pane, use the Data Source Menu and select New data source join.... You could select the option Join data to CARS_SPECS but my screen shots will follow the first option.
Next, use the dialogue box to select the second data source, specify the join type and, if you want, rename your resulting join table. I'm using the default name.
Next, specify your join conditions:
And lastly, choose the columns you wish to be in the result table. It is helpful that you can see the source table of the column in this window. Use the arrows between the available and selected windows to move columns. You do not need to keep the duplicate columns used in the join criteria.
The final screen should look something like this:
You will now see the resultant data join table available as a report data source. You can use this to source any of your report objects.
You’ll want to make sure your join is producing the expected results. To verify, I find it easiest to use the List Table object and add an object for each data source. Then I use page prompts, mapping data if necessary, to narrow down the rows for comparison to make sure the row count and values are accurate.
In this example, I will combine two features introduced in the VA 8.3 release. First, I will create an aggregated data source and then join that aggregated data source to the original data source. Why? Here is the sample business case:
I have several Product Lines and their associated Order Product Cost, that is direct manufacturing costs associated with that transaction, at the year, month, etc. level that I want to compare to the "overall" Average Order Product Cost.
These are the kinds of visualizations I am after. You can see that in the Target Bar Chart object I am using the Average Order Product Cost as the target measure that will dynamically change depending on the selected Product Line. And in the Time Series Plot I am using the Average Order Product Cost as a reference line.
I will walk through the click steps in this article, but if you are interested in more details about creating and using an aggregated data source, please see my VA 8.3 Aggregated Data Source article.
First, in the original data source, I must select the type of aggregation I want used for my measure in my aggregated data source. I want to analyze the average Order Product Cost, so I will select the aggregation type Average. Use the double chevron arrows to expand the data properties of the data item to edit the aggregation type.
Next, from the Data Pane and with your original data source selected, use the Data Source Menu and select New aggregated data source....
I selected two data items: Product Line and Order Product Cost to be included in my aggregated data source. You can see from the Preview section below how the average Order Product Cost will be stored for each Product Line value.
With the new aggregated data source active, I renamed the data item from Order Product Cost to Order Product Cost (Total Avg) so that I do not have duplicate data items in the Join result.
Next, I join the original data source to the aggregated data source. In an overly simplistic explanation, this is what the join will do for every row of the data. A new column will be added and for each row where the Product Lines match, the value will be written in that row.
From the Data Pane, with the original data source selected, use the Data Source Menu and select New data source join....
Use the Data Source Join window to define your new data source. Select the aggregated data source as the second join. Make sure the Join type is set appropriately. Notice my join condition is only where
Product Line = Product Line. Remember, for each row I want to copy over the aggregated measure for comparison. As for my selected columns, I brought over everything from the original data source and then my new column from the aggregated data source.
Now my join result is available as a data source to my VA report:
The advantage of this approach is that since this aggregated column is available for every row in my data, I can use any level of date format I wish without any extra work. The previous screen shots I had was comparing the years. But if I make Year a page prompt, I could look at the month detail such as in this example:
This is an exciting new feature and I know that being able to join data inside a VA report will help meet many customers' reporting requirements.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.