BookmarkSubscribeRSS Feed
lenats
Calcite | Level 5

HI,

I have a huge cube (10 years of monthly data) that updates every month with the UPDATE_INPLACE option.

The cube is built on a view, that takes more place than the temporary library on the server can hold (when select * is executed )

and it can't be reloaded at once.

Let's say you have a change in the underlying data and have to reload one month. How do you do it ?

I can't (easily) rerun the cube. Additional UPDATE_INPLACE run doubles the data 

 My idea would be to delete a month of data and reload it, nut I can't find the right option... any ideas? 

(SAS 9.3, DI 4.6, Proc OLAP ) 

11 REPLIES 11
Shmuel
Garnet | Level 18

Have you tried run a simple copy of the view to a sas dataset,

either using proc copy or just:

   data new;

    set have;

   run;

 

what issues do you have to run such a code?

lenats
Calcite | Level 5

The view contains a cartesian join tracking the customers' movement through the categories. one "slice"  = 500 mb, and we have 10 years of data . I don't want to use that much disk space

we used this design once, but moved to oracle view + bitmap indices instead.

Shmuel
Garnet | Level 18

Something is not clear to me:

 

You said "Additional UPDATE_INPLACE run doubles the data".

 

You have a view of 10 years. Your update is of one month only.

 

Then what do you mean that it doubles the data ?

and if update is INPLACE why should it double the data ?

lenats
Calcite | Level 5
The cube is updated from a temporary table. The whole view is only used for
the drill through and rerun (initialize) purposes
lenats
Calcite | Level 5

Let's say i've already loaded my september data, and the cube shows that i have total 5 million customers

If I load it again with update_inplace  the cube shows me 10 million customers

 

The oracle tables underlying the view are partitioned, therefore I can easily load one month.

They have bitmap indices too, so that the drill through works really fast (the hierarchies correspond to the indices)

what I can't do is to perform a select * from the view without any where statements, the oracle's temp space can't hold it, and the SAS work space  can't hold it either.

AhmedAl_Attar
Rhodochrosite | Level 12

Hi,

 

There are few factors about OLAP cubes you'll have to consider here, specially when it comes to updating historical data elements stored in the cube.

1. Hierarchies involving your Time Dimension elements/variables

2. Stored Aggregations that directly/explicitly contains your monthly data (Month level/Quarter level/Yearly level)

3. Stored Aggregations that indirectly/implicitly contains your monthly data (Year over Year differences, Annual Total by other dimension)

4. Custom Calculated Members / MDX formulas

5. Custom Sets

6. ?

 

When you start looking at how your monthly data is consumed inside your Cube, you'll realize, sometimes the best way to ensure your data updates get propagated correctly, is to rebuild the entire cube!

 

If you had the OLAP Cube built/constructed using per-aggregated tables (ROLAP Aggregation Storage), then you can go ahead, and update each and every involving aggregation table in order to update the OLAP Reports & results.

 

BTW, this is how you can use Proc SQL to access a SAS OLAP Cube,

proc sql;
connect to olap (host=localhost port=5451 user=user pass=pass);
         select * from connection to olap (
              "select [A].[All A].[M].[Adam's] on Rows,
                 crossjoin([Measures].defaultMember, [B].defaultMember)
                 on Columns
                   from [QuoteTest]"
       );
disconnect from olap;
quit;

Just my 2 cents,

Hope it helps,

Ahmed

lenats
Calcite | Level 5

The cube is not that complicated. Nway aggregation, shallow time dim (only months) the most complicated hierarchy is 4 level internal hierarchy tree, simple calculated members (ratios etc) .

 

Most of the work is done by the ETL beforehand, creating the "slices" ( each slice contains a customer,base month, new month, the data contains his movement between hierarchies' members , i.e. John was premium in 03/2014, nowdays not-so-premium.

 

The cube shows movement totals, i.e. comparing mar 2014 to 2017 10 clients moved to premium, 15 clients moved from premium to ordinary, 10 clients are new, 20 left, 150 stayed in premium,1000 stayed in ordinary class

AhmedAl_Attar
Rhodochrosite | Level 12

Hi,

I think it's fare to say, the Update feature of the SAS OLAP Cubes are Incremental Updates, rather than Historical Updates!

- What’s New in SAS ® OLAP Cube Studio 4.2

 

Therefore I would encourage you to review the following links to get an idea of your alternative approaches and design decisions

 

- SAS ® OLAP Cube Tuning and Query Performance Optimization

- Manipulating OLAP Cubes: Advanced Techniques for SAS ® Programmers

- Escape from Big Data Restrictions by Leveraging Advanced OLAP Cube Techniques

- Using SAS ® OLAP Server for a ROLAP Scenario

 

- Utilizing Oracle Materialized Views as underlying aggregates

Basic Materialized Views

CREATE MATERIALIZED VIEW

OLAP Expression Syntax Reference -- OLAP Functions

 

In Summary, I think you would be better served by utilizing Oracle Materialized Views & Aggregates, to handle underlying Historical Data Updates and Drill-Through queries. And for Client movement totals you can either per-calculate them in the your Materialized Views, using Oracle's OLAP Functions or using MDX within your Proc OLAP Statements to define Calculated Members.

 

In theory, this design would allow you to:

- Handle all future Historical data updates automatically by Oracle - Faster availability for Reporting

- Handle all future Incremental data Updates automatically by Oracle - Faster availability for Reporting

- Allow for other tools to utilize the Oracle Materialized Views

- Reduce data duplication, as the data would only reside in Oracle, rather than storing additional Aggregates in SAS OLAP Cube(s)

- Reduce/Eliminate some of the existing ETL jobs and transformations

 

Thanks,

Ahmed 

 

lenats
Calcite | Level 5

Let me see if I understand you

the idea is to build the oracle view for the drill through purposes , and to add a materialized view that sums the data and to build the cube on the materialized view without the nway aggregation ? 

 

AhmedAl_Attar
Rhodochrosite | Level 12

Yes, Pretty much.

You can define your ROLAP aggregates in the Cube as your underlying Oracle Materialized Views. Depending on your Dimensions, Hierarchies and Drill-paths, you may be able to create and store SAS MOLAP aggregates in your Cube for data that doesn't intersect with Volatile/Changing data over time.

 

 

BTW, Keep in mind, ROLAP aggregates does not have to be Database Tables/Views! They can be based on pr-aggregated SAS data sets. So if you have an ETL process that would extract data from Oracle and rebuild/update your underlying SAS Aggregates via Proc Summary/Means. You may not get the same performance speed if you don't store these per-aggregates in SPDE format, like how the OLAP Cube does behind the scenes.

 

You multiple options to consider and evaluate, an the end you have to choose what makes most sense to you and your organization.

 

Ahmed

lenats
Calcite | Level 5
Thank you for the ideas! I'll try it.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1017 views
  • 1 like
  • 3 in conversation