11-20-2017 08:53 AM
I have a huge cube (10 years of monthly data) that updates every month with the UPDATE_INPLACE option.
The cube is built on a view, that takes more place than the temporary library on the server can hold (when select * is executed )
and it can't be reloaded at once.
Let's say you have a change in the underlying data and have to reload one month. How do you do it ?
I can't (easily) rerun the cube. Additional UPDATE_INPLACE run doubles the data
My idea would be to delete a month of data and reload it, nut I can't find the right option... any ideas?
(SAS 9.3, DI 4.6, Proc OLAP )
11-20-2017 09:02 AM
Have you tried run a simple copy of the view to a sas dataset,
either using proc copy or just:
what issues do you have to run such a code?
11-20-2017 09:06 AM
The view contains a cartesian join tracking the customers' movement through the categories. one "slice" = 500 mb, and we have 10 years of data . I don't want to use that much disk space
we used this design once, but moved to oracle view + bitmap indices instead.
11-20-2017 09:15 AM
Something is not clear to me:
You said "Additional UPDATE_INPLACE run doubles the data".
You have a view of 10 years. Your update is of one month only.
Then what do you mean that it doubles the data ?
and if update is INPLACE why should it double the data ?
11-21-2017 02:17 AM
Let's say i've already loaded my september data, and the cube shows that i have total 5 million customers
If I load it again with update_inplace the cube shows me 10 million customers
The oracle tables underlying the view are partitioned, therefore I can easily load one month.
They have bitmap indices too, so that the drill through works really fast (the hierarchies correspond to the indices)
what I can't do is to perform a select * from the view without any where statements, the oracle's temp space can't hold it, and the SAS work space can't hold it either.
11-20-2017 10:14 AM
There are few factors about OLAP cubes you'll have to consider here, specially when it comes to updating historical data elements stored in the cube.
1. Hierarchies involving your Time Dimension elements/variables
2. Stored Aggregations that directly/explicitly contains your monthly data (Month level/Quarter level/Yearly level)
3. Stored Aggregations that indirectly/implicitly contains your monthly data (Year over Year differences, Annual Total by other dimension)
4. Custom Calculated Members / MDX formulas
5. Custom Sets
When you start looking at how your monthly data is consumed inside your Cube, you'll realize, sometimes the best way to ensure your data updates get propagated correctly, is to rebuild the entire cube!
If you had the OLAP Cube built/constructed using per-aggregated tables (ROLAP Aggregation Storage), then you can go ahead, and update each and every involving aggregation table in order to update the OLAP Reports & results.
BTW, this is how you can use Proc SQL to access a SAS OLAP Cube,
proc sql; connect to olap (host=localhost port=5451 user=user pass=pass); select * from connection to olap ( "select [A].[All A].[M].[Adam's] on Rows, crossjoin([Measures].defaultMember, [B].defaultMember) on Columns from [QuoteTest]" ); disconnect from olap; quit;
Just my 2 cents,
Hope it helps,
11-21-2017 02:29 AM - edited 11-21-2017 02:31 AM
The cube is not that complicated. Nway aggregation, shallow time dim (only months) the most complicated hierarchy is 4 level internal hierarchy tree, simple calculated members (ratios etc) .
Most of the work is done by the ETL beforehand, creating the "slices" ( each slice contains a customer,base month, new month, the data contains his movement between hierarchies' members , i.e. John was premium in 03/2014, nowdays not-so-premium.
The cube shows movement totals, i.e. comparing mar 2014 to 2017 10 clients moved to premium, 15 clients moved from premium to ordinary, 10 clients are new, 20 left, 150 stayed in premium,1000 stayed in ordinary class
11-22-2017 09:38 AM
I think it's fare to say, the Update feature of the SAS OLAP Cubes are Incremental Updates, rather than Historical Updates!
Therefore I would encourage you to review the following links to get an idea of your alternative approaches and design decisions
- Utilizing Oracle Materialized Views as underlying aggregates
In Summary, I think you would be better served by utilizing Oracle Materialized Views & Aggregates, to handle underlying Historical Data Updates and Drill-Through queries. And for Client movement totals you can either per-calculate them in the your Materialized Views, using Oracle's OLAP Functions or using MDX within your Proc OLAP Statements to define Calculated Members.
In theory, this design would allow you to:
- Handle all future Historical data updates automatically by Oracle - Faster availability for Reporting
- Handle all future Incremental data Updates automatically by Oracle - Faster availability for Reporting
- Allow for other tools to utilize the Oracle Materialized Views
- Reduce data duplication, as the data would only reside in Oracle, rather than storing additional Aggregates in SAS OLAP Cube(s)
- Reduce/Eliminate some of the existing ETL jobs and transformations
11-23-2017 03:21 AM
Let me see if I understand you
the idea is to build the oracle view for the drill through purposes , and to add a materialized view that sums the data and to build the cube on the materialized view without the nway aggregation ?
11-23-2017 10:44 AM
Yes, Pretty much.
You can define your ROLAP aggregates in the Cube as your underlying Oracle Materialized Views. Depending on your Dimensions, Hierarchies and Drill-paths, you may be able to create and store SAS MOLAP aggregates in your Cube for data that doesn't intersect with Volatile/Changing data over time.
BTW, Keep in mind, ROLAP aggregates does not have to be Database Tables/Views! They can be based on pr-aggregated SAS data sets. So if you have an ETL process that would extract data from Oracle and rebuild/update your underlying SAS Aggregates via Proc Summary/Means. You may not get the same performance speed if you don't store these per-aggregates in SPDE format, like how the OLAP Cube does behind the scenes.
You multiple options to consider and evaluate, an the end you have to choose what makes most sense to you and your organization.