I have an Information Map with one fact table (~1MM rows) and 8 dimension tables attached. We are pulling data from a SPD Server "data warehouse". When I utilize the information map to develop reports in Web Report Studio, I have a speed problem when I create a report that utilizes a category fieldf rom more than one dimension table. For example:
Fact Table: "Transaction"
Dim Table: "Product"
Dim Table: "Date"
Let's say there is a field in "Transaction" called Transaction $, and I am trying to sum that by Product and Date in a WRS report.
(1) Create a report with Date and Transaction $. The report generates in ~5 seconds.
(2) Create a report with Product and Transaction $. The report generates in ~5 seconds.
(3) Create a report with Date, Product and Transaction $. The report generates in ~4.5 minutes.
This example is simplified; the delay when adding an element from a second dimension table always causes the same time delay. Here are several things we've done to test the query:
Test 1: Using the "Test" facility in Information Map Studio, we see the exact same delay in returning results - adding a category variable from one dim table gives us fast response, while adding categories from two dim tables gives us slow response.
Base SAS: Took the SQL query code that was generated from Information Map Studio ("View SQL") and ran the same query in Base SAS. Same response times and issues. Then, we made some changes to the code to see if we could get a better response from the SPD database if we changed the query. Changed indexing, added the Starjoin option and still got same response. Finally, when we manually added subqueries to the SQL code the query responded a short period of time.
It seems that the query generated by the Information Map is not performing well in this Star Schema; this is troubling, because it is a basic underpinning of BI systems using Star Schema. This report from SAS Global Forum (http://support.sas.com/resources/papers/proceedings10/304-2010.pdf) indicates that the SQL Planner should be able to optimize the results of the query. That does not seem to be happening.
Any thoughts from users on steps to take to improve this performance?
It covers the undocumented PROC SQL option _METHOD that shows you what the SAS SQL optimiser has done to your query. Use it on both your original and improved SQL queries and see what SAS is doing differently.
I think this is a unique issue, in that I'm trying to utilize the base functionality of Information Map Studio, and I don't know how to alter how IMS creates those joins or how the SQL Planner makes decisions. Are there any papers or books that relate to optimizing the SQL Planner in SPDS, or altering the way Info Map Studio creates joins?
If you can't define the Information Map in a way that it generates the SQL code you need then eventually you could create a SQL view containing the optimised SQL code and an Info Map over this view.
Reading the paper in that link I gave is a good start to understanding how the SQL optimiser works. I hate to say this but there are no magic bullets when it comes to performance tuning, and especially so with SQL.
One thing to look for when using _METHOD is to see if SAS is doing table scans on joins instead of indexed joins. Table scans are way slower. It could be you are missing an index, resulting in a slow join. Only careful checking will uncover the true reasons.