Some projects I work on, make use of Industry Models, like the IIW Insurance Information Warehouse by IBM. These are standard warehouses, with accompanying solutions for Solvency reporting etc.
Loading these Industry Models sometimes takes hundreds or thousands of mappings, transporting and transforming the data through the different layers (stages) and ultimately into the Warehouse.
Some of my colleagues (working on different ETL tools like Informatica Datastage) have come with a solution facilitating this process by Template Based ETL Design.
The idea is that a program can be developed that reads a template (XML, XLS or other format), and processes the data into mapping files that can be read by the ETL tool. A first successful approach was by using Python (OO programming language), reading an XLS file containing source columns, target columns and simple transformational info. The Python code uses a custom built library where the ETL functions of Informatica are replicated in Python code. The library functions can then be used in the Python program.
This works for Informatica Datastage, because Datastage can read/import the XML mapping files created by Python. Downside is that the XML is version dependent and needs to be changed whenever there is a change in the Datastage version.
But it works.
I would like to be able to do a similar exercise for SAS, but am not sure if is it feasible, because SAS is so very metadata driven and has limited import functionality.
Maybe SAS Macros are a possibility, but I have 0% experience in SAS Macros.
My question to you on this forum is, if Template Based ETL Design is at all possible in SAS and if so, what is the best approach?
We're considering a product feature that would work in this way. We would allow you to specify template jobs for particular table types (dimension, fact, bridge, etc), specify a data model and then we would generate jobs and a process flow that you could customize. Does this sound like what you're after? This is future rather than current product, but I'm interested to hear if this would meet you needs.
I think it would be difficult to do this with the current product. Jobs are modeled at a very low level of detail as metadata objects and it would take deep internal knowledge of how the job is constructed to accomplish this. That said, I'll check to see if we have any utilities that would meet your needs.
This is a very interesting idea. We are using XML as a transmission medium for specifications for a new generalized tabulation product we are developing, so it's possible. I would be quite confident taking something similar on for the DIS environment, but I would expect it to be a LOT of work; more in the category of significant system development project than something that could be lashed together quickly. On the other hand, once you built it, you could always sell it!
One could say that SAS has built such ETL templates for their industry/cross-industry solutions. Probably by "hand".
One idea is to do the same for non SAS solutions, such as the already mentioned IIW. Once done for an initial project, the jobs can be reused at other sites, as long as the necessary terms are set. The data models in respective soultion could quite easily be imported using meta bridges.
I might be wrong, but I believe Insfocus implemented something like that in their insurance data warehouse. I was working there then and they wanted to implemented some SAS solutions. I don't know what become of that.