Greetings all,
I have an Information Map with one fact table (~1MM rows) and 8 dimension tables attached. We are pulling data from a SPD Server "data warehouse". When I utilize the information map to develop reports in Web Report Studio, I have a speed problem when I create a report that utilizes a category fieldf rom more than one dimension table. For example:
Fact Table: "Transaction"
Dim Table: "Product"
Dim Table: "Date"
Let's say there is a field in "Transaction" called Transaction $, and I am trying to sum that by Product and Date in a WRS report.
(1) Create a report with Date and Transaction $. The report generates in ~5 seconds.
(2) Create a report with Product and Transaction $. The report generates in ~5 seconds.
(3) Create a report with Date, Product and Transaction $. The report generates in ~4.5 minutes.
This example is simplified; the delay when adding an element from a second dimension table always causes the same time delay. Here are several things we've done to test the query:
Test 1: Using the "Test" facility in Information Map Studio, we see the exact same delay in returning results - adding a category variable from one dim table gives us fast response, while adding categories from two dim tables gives us slow response.
Base SAS: Took the SQL query code that was generated from Information Map Studio ("View SQL") and ran the same query in Base SAS. Same response times and issues. Then, we made some changes to the code to see if we could get a better response from the SPD database if we changed the query. Changed indexing, added the Starjoin option and still got same response. Finally, when we manually added subqueries to the SQL code the query responded a short period of time.
It seems that the query generated by the Information Map is not performing well in this Star Schema; this is troubling, because it is a basic underpinning of BI systems using Star Schema. This report from SAS Global Forum (http://support.sas.com/resources/papers/proceedings10/304-2010.pdf) indicates that the SQL Planner should be able to optimize the results of the query. That does not seem to be happening.
Any thoughts from users on steps to take to improve this performance?