I am looking for guidance on how to identify the root cause of a data refresh issue when using the SAS Add-In for Excel to open a data set directly into a pivot table. Specifics of the problem and steps that I have already taken are:
• Data set contains 90,000 observations and 47 fields.
• When I create a simple pivot table, I encounter a refresh wait of roughly 17 seconds.
• When I create a complex pivot table, I encounter a refresh wait of more than 5 minutes. This report consists of three column labels, four row labels, and two values (one sum and one average).
• When I open the data set into a worksheet and then create the complex pivot table using a named range, I encounter no performance issues. The 90,000 records take roughly 33 seconds to refresh completely, and there is no noticeable drag in pivot table performance.
• Last item: we are running on workstations/high-end laptops with 4 GB of RAM.
Does anyone have any ideas why I might be seeing a data refresh and update period of more than 5 minutes on the complex pivot table? Is this a configuration issue, a limitation of the add-in, or something else? Hope this does not sound like a snipe hunt.
When you are using the Enterprise Intelligence Platform, your configuration setup and usage will be different from almost every other site's installation. To work with someone to debug whether it is your particular configuration or some other issue, you should open a track with Tech Support: http://support.sas.com/ctx/supportform/createForm
Excel has to pull all the data to the client to perform the aggregations on the tabular data and then display the results in the PivotTable. So, the most likely explanation of the delay is the time it takes to get the data to the client. That would also explain why it is much faster if you first open the data into a worksheet. It also takes Excel some time to perform the aggregations -- obviously longer for more data.
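The split described above (ship all rows to the client, then aggregate locally) can be sketched outside Excel entirely. This is a hedged illustration only: the field names, cardinalities, and values below are made up, not the actual 47-field data set, but the shape of the work is the same as what a PivotTable does after the pull.

```python
import random
import time
from collections import defaultdict

random.seed(0)

# Hypothetical stand-in for the 90,000-row data set described in the post;
# field names and cardinalities are assumptions, not the real data.
rows = [
    {
        "year": random.choice([2018, 2019, 2020]),
        "quarter": random.choice(["Q1", "Q2", "Q3", "Q4"]),
        "amount": random.random() * 100,
    }
    for _ in range(90_000)
]

start = time.perf_counter()

# Client-side aggregation, the work Excel does after pulling the data:
# one sum and one average per (year, quarter) group.
sums = defaultdict(float)
counts = defaultdict(int)
for r in rows:
    key = (r["year"], r["quarter"])
    sums[key] += r["amount"]
    counts[key] += 1
averages = {k: sums[k] / counts[k] for k in sums}

elapsed = time.perf_counter() - start
print(f"{len(rows)} rows aggregated into {len(averages)} result cells "
      f"in {elapsed:.3f} s")
```

The aggregation itself is quick for 90,000 rows; the point is that it cannot start until every row has crossed the wire, which is why opening into a worksheet first feels so much faster.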
For better PivotTable performance on larger data, we typically recommend using OLAP cubes/server, so processing takes place on the server instead of the client.
PS: You can test Excel's PivotTable performance on your data outside of the SAS Add-In by manually creating a PivotTable in Excel using an external data source and the appropriate data provider (SAS IOM provider for server data, SAS Local Data provider for local data, SAS OLAP provider for OLAP cubes). It should be equivalent to the performance you experience in the SAS Add-In.
No. Any filters and subsetting are applied on the server. Then, Excel pulls only the filtered/subsetted data. So, filtering and subsetting are a good way to increase PivotTable performance. Thanks for asking.
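In the same spirit, here is a minimal sketch of why server-side subsetting helps: the filter runs before any data crosses to the client, so only matching rows are pulled. The field names and the 5-year spread are assumptions for illustration, not the actual data.

```python
# Hypothetical 90,000-row data set; field names and values are assumptions.
rows = [{"year": 2016 + (i % 5), "amount": float(i % 97)} for i in range(90_000)]

def server_subset(data, year):
    """Stand-in for a filter applied on the SAS server, before transfer."""
    return [r for r in data if r["year"] == year]

# The client only ever receives the subset, not all 90,000 rows.
pulled = server_subset(rows, 2020)
print(f"Pulled {len(pulled)} of {len(rows)} rows after filtering")
```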
Wanted to thank everyone for their input and to provide some updated information/observations based upon some experimentation that I performed over the weekend.
1. The pivot table was supplied to me by one of the users that I support.
2. I rebuilt the pivot table field by field and forced a data refresh with each field addition to measure relative performance.
3. The performance issue in this case appears when a text field is added as a pivot table column label and other column labels already exist (year and quarter).
4. The text field contains roughly 51 unique values.
5. The bulk of the problem appears to be one of education, as it is difficult for an end user to identify when the data has been extracted from SAS via the SAS Add-In for Excel and when Excel functionality kicks in. The user in question attributed it to a SAS issue, and it was not. Overall, more than three fourths of the performance drag appears to be Excel having to generate an extremely large number of cell calculations.
6. I do want to state that when I dropped the data from SAS directly into a worksheet and built a pivot table off of the worksheet data, the performance issue still existed, but not to the extent that I encountered when I dropped the data from SAS directly into a pivot table.
7. This goes to show that a user really needs to know their data and what they're asking of it when building a pivot table. I am definitely curious to see what type of performance gains could be made using an OLAP solution and keeping the data and calculations server side.
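The cell-count explosion in items 3 through 6 can be made concrete with some back-of-the-envelope arithmetic. Only the 51 unique text values comes from the observations above; the year, quarter, and row-label cardinalities below are hypothetical placeholders.

```python
# Assumed cardinalities for the column labels; only n_text = 51 is from the post.
n_years = 5           # hypothetical
n_quarters = 4
n_text = 51           # unique values in the text field that was added

n_row_combos = 200    # hypothetical product of the four row-label cardinalities
n_values = 2          # one sum and one average

columns_before = n_years * n_quarters        # column combos with year x quarter
columns_after = columns_before * n_text      # after adding the text field

cells_before = columns_before * n_row_combos * n_values
cells_after = columns_after * n_row_combos * n_values

print(f"Column combinations: {columns_before} -> {columns_after}")
print(f"Cells to compute:    {cells_before:,} -> {cells_after:,}")
```

Under these assumptions, adding the text field multiplies the number of column combinations (and therefore the cells Excel must calculate) by 51, which lines up with the large slowdown appearing only once that field joins the existing year and quarter column labels.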