I need to build a new repository to store data from oracle (our core system).
We target to do some analysis on the sales performance / segmentation study (regarding their sales habit on product, customer types, sales' demographic factors ...)
I assumed we will use the typical star schema which the dimension tables (products, customer, sales) point to the fact table we want to analysis (orders).
The total no of the orders are about 800000. And when it break down to details e.g. product level it includes about 4000000 records. In other words, each order contains about 5 products in average. And the order records will contains some detail info e.g. sales date, order date ...
Besides, each order details (in product level) will generate different sales amount per month
e.g order 1 has 5 products
in 2011 Jan
the record will be like
order_id product_id amount
1 1 10
1 2 5
1 3 8
1 4 10
1 5 5
in 2011 Feb
the record will be like
order_id product_id amount
1 1 8
1 2 5
1 3 0
1 4 0
1 5 12
and so on.
the first question is that it's reasonable to treat this sales table as the fact table? and whether it's correct to put it's details (not the meansures e.g. sale amount) to it (which most of these data are static and not change frequently) or treat it as dimension table?
the second question is how should I handle these dimension table (which not update frequency). should I add product month on each table and keep every set of data as if we need to analysis the historical data? Or we can keep the current copy and only store the data by production month in the OLAP cube levels?
More, for the fact table, should we keep it by production month too, the data volume seems not small to me i.e. 4000000 x 12 for one year.
More, I heard that SAS support the use of change date capture with oracle. Is it a way to handle the aforesaid questions?
Or, I should just use the materialized views for the dimension tables?
Since I am totally new to SAS and data warehouse, really hope anyone experienced can give advice to you.
Thx in advance.
... View more