06-28-2013 11:54 AM
We are evaluating DI Studio (v 4.4) capabilities. Specifically, the Import SAS Code process.
Client would like to investigate the possibility of importing BASE code into DI to automatically generate a job and create metadata. We are interested in the following:
I understand that this may be a lot to ask for from the tool. Any suggestions/observations are welcomed.
06-28-2013 04:47 PM
I haven't tried it, but looks to me like #1 should be possible. A DI job can invoke user-written SAS code. As I read the docs, if you have a metadata library pointing to the DBMS, you can run PROC METALIB to register all the tables in the database.
06-29-2013 11:13 PM
Using DIS4.6 I've done some testing importing DIS generated SAS code (after changing the name of the target table). What I found:
- I've got working DI jobs for most of the .sas code
- Libraries used need to be pre-registered
- The new target table got automatically created in metadata as permanent table (based on the column definitions in the code).
- I had issues with a few programs where I'm re-defining libraries dynamically within the code for context specific execution.
- The DI jobs generated worked but the flows consisted of nodes called "Datastep" and the flow was not very desciptive on showing what's going on in the job.
- External tables registered in metadata didn't get added to the flows (eventually because the path definitions for these external files were SAS macro variables). The import step got added as code so the resulting DI job as such was working correctly.
- Some column mapping had been added to nodes but it wasn't a clean end-to-end mapping (also because external tables hadn't been added).
- "importability" of .sas files and quality of resulting flows will strongly depend on structure and code quality of source .sas files.
- assign libraries in metadata using librefs as in code.
- register tables before importing the SAS programs (so that table metadata gets created with all attributes and not only what has been defined in code).
- consider re-working the DI jobs created to get more descriptive (= better maintainable) flows and to add full end-to-end mapping from source to target (=allowing for column level impact analysis).
07-02-2013 05:06 AM
In addition I would like to pint a current defect, the import will not identify if a source table has PK/FK on them, leading to that this metadat will be rewritten, and probably leading to a Java error during the import.
Another thing is to make sure that intermediate work tables should have individual names, reusing a table name will make DIS confused, or at least could make the flow look weird.
Macro looping can work, but if you have call execute functionality, this will probably have an unwanted layout in the DIS flow.
07-02-2013 02:30 PM
Thanks for the input, greatly appreciated.
We found limitations even when using DI generated code. When importing back into DI, it would not produce the same results.
Libraries and input tables were pre-registered. DI did not register/recognize input DB2 tables only register output SAS tables.
Obviously this affected mappings and subsequently lineage capabilities.
Patrick, point taken about "quality" of source code. I feel that the sample code I'm working with is clearly structured and well written. No complex macros or looping involved.