Desktop productivity for business analysts and programmers

DI vs EG

Accepted Solution Solved
Reply
N/A
Posts: 1
Accepted Solution

DI vs EG

Hi,

Can someone please tell what is the difference as well as advantage and disadvantage between DI and EG in term of doing ETL? I see EG can do ETL also, but I really wonder the advantage using DI to EG in this matter. Thanks a bunch!!

Accepted Solutions
Solution
‎09-05-2017 11:10 AM
New Contributor
Posts: 3

Re: DI vs EG

[ Edited ]

For ETL activities, especially those on a larger scale, I prefer the rigour that DIS brings. It's a complete metadata-driven solution.

On a practical level, I have found the following differences when using EG4.1 for ETL tasks:
- EG does not have a loop construct
- EG does not have pre-built auditing, pre and post processing, data validation, lookup, and data loader transformations.
- Creating and using indexes in DIS is a couple of mouse clicks......more difficult in EG.
- Deploying code is more cumbersome in EG. For each job, an EG project must first be run, exported, and then deployed in SAS management console (if using Platform scheduling). DIS allows multiple jobs to be deployed with a single 'click'.
- Modifying EG projects can be a little tricky. An EG project can't be edited until it has first been run in the current session (except for custom code). If custom code is being edited, the project must be re-run to ensure the correct code is exported. An easy step to forget.
- DIS SQL transformation is more advanced than the EG query window. eg for performance tuning, multi-column outer joins (changed under EG4.3?)
- No proc append transform in EG

I find EG preferable to DIS for speed of development and debugging, although DIS4.2 appears much improved in this regard (along with some other nice enhancements).

For most of the differences I have listed, there are alternative solutions or workarounds. As Chris mentions above, the scale of your ETL tasks and whether or not you choose to adopt the 'discipline' are probably more important factors.

View solution in original post


All Replies
Community Manager
Posts: 2,888

Re: DI vs EG

Here's what I tell people (and I'm in R&D remember, not marketing...):

Data Integration (or Master Data Management, or ETL, or whatever you call it...) is more than just using tools to manage data -- it's a discipline. The DataFlux and Data Integration Studio solutions from SAS support that discipline as part of their workflow.

EG supports many of the same activities associated with ETL, but doesn't facilitate the workflow often required for large scale data management.

Chris
Solution
‎09-05-2017 11:10 AM
New Contributor
Posts: 3

Re: DI vs EG

[ Edited ]

For ETL activities, especially those on a larger scale, I prefer the rigour that DIS brings. It's a complete metadata-driven solution.

On a practical level, I have found the following differences when using EG4.1 for ETL tasks:
- EG does not have a loop construct
- EG does not have pre-built auditing, pre and post processing, data validation, lookup, and data loader transformations.
- Creating and using indexes in DIS is a couple of mouse clicks......more difficult in EG.
- Deploying code is more cumbersome in EG. For each job, an EG project must first be run, exported, and then deployed in SAS management console (if using Platform scheduling). DIS allows multiple jobs to be deployed with a single 'click'.
- Modifying EG projects can be a little tricky. An EG project can't be edited until it has first been run in the current session (except for custom code). If custom code is being edited, the project must be re-run to ensure the correct code is exported. An easy step to forget.
- DIS SQL transformation is more advanced than the EG query window. eg for performance tuning, multi-column outer joins (changed under EG4.3?)
- No proc append transform in EG

I find EG preferable to DIS for speed of development and debugging, although DIS4.2 appears much improved in this regard (along with some other nice enhancements).

For most of the differences I have listed, there are alternative solutions or workarounds. As Chris mentions above, the scale of your ETL tasks and whether or not you choose to adopt the 'discipline' are probably more important factors.

Super Contributor
Posts: 414

Re: DI vs EG

A very fundamental difference is that EG is based on the physical structures of the atrifacts you're working with while DIS is based on their registered metadata. This means DIS is better equiped for a formal workflow of design-build-test while EG is more targeted at ad-hoc, analytical and end-user work.

What we often do is to prototype our ideas and concepts in EG and then implement them in DIS for formal development.
Contributor
Posts: 71

Re: DI vs EG

What does DIS use for scheduling tasks...Platform Manager / LSF?

And does anyone else use a different 3rd party task scheduler instead?

Thanks.
Occasional Contributor
Posts: 9

Re: DI vs EG

I think what is more importat is to define if the information that will be used is on enterprise source or a local or stand-alone source.
If you need to build process that will read shared information on a enterprise repository I recommend DI.

I can to say that Guide is very useful for lab experiments, to generate short reports or short process for atend of temporal needs rather, use DI for robust processes, that will use information from enterprise source, will extract them and will transform them for put on another enterprise reporistory.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 1663 views
  • 0 likes
  • 6 in conversation