For many, the term data warehouse denotes a data retrieval system that supports fast, efficient access through a well-known, well-defined schema, such as a star schema. A typical star schema consists of a group of dimension tables that have primary/foreign key associations to a central fact table. These storage paradigms, which originated in the 1990s, were very adept at providing quick, easy access for business reports. The investment in both hardware and software to prototype, develop, and build these systems is significant, and the investment required to maintain and update them will continue to be significant.
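To make the star schema description concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_product`, `dim_region`, `fact_sales`) and the sample rows are hypothetical and chosen only for illustration; a real warehouse would have many more dimensions and far more data.

```python
import sqlite3


def build_star_schema(conn):
    """Create two dimension tables and one central fact table (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL
        );
        CREATE TABLE dim_region (
            region_id   INTEGER PRIMARY KEY,
            region_name TEXT NOT NULL
        );
        -- The fact table holds the measures plus a foreign key to each dimension.
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            region_id  INTEGER NOT NULL REFERENCES dim_region(region_id),
            amount     REAL NOT NULL
        );
        INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
        INSERT INTO fact_sales  VALUES (1, 1, 1, 100.0), (2, 1, 2, 50.0), (3, 2, 1, 75.0);
    """)


def sales_by_region(conn):
    """A typical report query: join the fact table to one dimension and aggregate."""
    return conn.execute("""
        SELECT r.region_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        GROUP BY r.region_name
        ORDER BY r.region_name
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
report = sales_by_region(conn)
print(report)  # [('East', 175.0), ('West', 50.0)]
```

Because every report follows the same pattern (join the fact table to the dimensions it needs, then aggregate), queries stay simple and fast, which is the property that made this design popular for business reporting.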
Unfortunately, in today’s business world, the use of analytics to drive real-time decisions cannot be fully supported by traditional data warehouses, because they lack the full breadth of data required. With the volume, velocity, and variety of big data, organizations need to leverage data from a variety of data sources, not just the data warehouse. The time and expense involved in using extract, transform, and load (ETL) processes to update the data warehouse mean that timely business decisions based on data are no longer possible. Business markets and directions typically change faster than a data warehouse can evolve to support the new directions and initiatives.
For that reason, data virtualization using the SAS Federation Server allows IT and business users to leverage disparate sources of data very quickly for driving analytics and business reporting. The ability to combine data warehouse data with social media, legacy data, and documents in a federated view, which can then be leveraged by analytics or reporting tools, is critical for keeping pace with business requirements. A federated view provides users with complete business insight to support revenue initiatives and mitigate cost and risks.
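The idea of a federated view can be sketched in a few lines. The example below does not use SAS Federation Server or its API; it is only an analogy built with Python's `sqlite3` module, in which two separate in-memory databases stand in for a data warehouse and a social media source. All names (`customer_sales`, `mentions`, `customer_360`) and data are hypothetical.

```python
import sqlite3

# The main database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
# A second, separate database stands in for an external source (e.g. social media).
conn.execute("ATTACH DATABASE ':memory:' AS social")

conn.executescript("""
    CREATE TABLE main.customer_sales (
        customer_id INTEGER PRIMARY KEY,
        total_spend REAL NOT NULL
    );
    INSERT INTO main.customer_sales VALUES (1, 1200.0), (2, 300.0);

    CREATE TABLE social.mentions (
        customer_id INTEGER NOT NULL,
        sentiment   TEXT NOT NULL
    );
    INSERT INTO social.mentions VALUES (1, 'negative'), (2, 'positive'), (2, 'positive');

    -- The view joins both sources at query time; no data is copied into the warehouse.
    CREATE TEMP VIEW customer_360 AS
    SELECT s.customer_id,
           s.total_spend,
           COUNT(m.customer_id)                           AS mention_count,
           COALESCE(SUM(m.sentiment = 'negative'), 0)     AS negative_mentions
    FROM main.customer_sales AS s
    LEFT JOIN social.mentions AS m ON m.customer_id = s.customer_id
    GROUP BY s.customer_id, s.total_spend;
""")

# A reporting or analytics tool would query the view like any other table.
rows = conn.execute(
    "SELECT customer_id, total_spend, mention_count, negative_mentions "
    "FROM customer_360 ORDER BY customer_id"
).fetchall()
print(rows)  # [(1, 1200.0, 1, 1), (2, 300.0, 2, 0)]
```

The point of the sketch is the shape of the solution: consumers see one view, while the underlying data stays in its original sources and is combined only when queried, with no ETL step in between.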
As organizations update traditional data warehouses or generate new data marts specific to business needs, the use of data virtualization to quickly interrogate the data for completeness becomes very important. You would not want to update a data warehouse star schema model with new data entities, only to discover during the later analytic and reporting phase that the data is lacking or incorrect.
Consider the following picture on how data virtualization can complement your data warehouse and then ask yourself these questions. If you can answer YES to just one, then data virtualization is worth investigating.
Is data virtualization a replacement for your data warehouse? No, absolutely not. It is a complement to the processes and procedures that your organization has used for decades as part of a warehousing methodology. Data virtualization can, however, help revitalize your data warehouse by extending its usefulness and allowing you to continue to leverage years of investment in the platform and technology of the data warehouse.
Check out my previous articles about accessing and reporting on disparate data sources with data virtualization and an all-too-uncommon exchange between a data and IT analyst about data needs.
And, follow the community for another article in this series focusing on real-time access and security.