For large data management projects, it’s not uncommon for customers to deliver long and detailed requirements for data access and data integration. Data source access options for SAS data management offerings are plentiful and well documented, but information on data integration options, the means by which software applications communicate and share data, is not as easy to come by (at least not all in one place).
So let’s forget about data access options for now (you have lots of them!) and focus on data integration. SAS customers are fortunate to have several data management applications to choose from, each offering a distinct set of features and playing well with the others and with non-SAS systems. Choosing the right technology for your data management project will depend on, among other things, the integration options each one provides. Sometimes a combination of data management applications might be needed to match the requirements of the project and the varied skills of potential users. Here’s a quick guide to help you make the right technology choice for your project.
The Data Management Platform (DMP) includes Data Management Studio, Data Management Server, and several additional modules and content libraries. Taken together, these components provide a robust environment for developing and deploying data quality-centric processes. Unique to this environment is access to sophisticated data profiling, data quality and data enrichment algorithms; the ability to deploy processes in batch and real-time modes; and an adroitness with heterogeneous IT environments. Here are just a few of the data integration features of this solution:
Data Integration Studio is a visual design tool for building and deploying data integration processes. Its distinctive features are its SAS code generation underpinnings, a multitude of built-in transformations, close integration with SAS metadata and many other enterprise ETL capabilities. Much of what you can do in Data Integration Studio can be done with manual SAS programming, but then you get little of the code manageability and none of the rich graphical user interface that Data Integration Studio provides. Among its features are these integration capabilities:
Visual Process Orchestration is a web application that can tie together various data integration and data management processes, both SAS code-based and non-SAS code-based, in a visual data workflow environment. It is differentiated by its web-based user interface and built-in logical data flow processing. It also nicely spans the Data Management Platform and SAS code-based data management technologies. Visual Process Orchestration has the following integration options:
You can see that these applications have many ways to communicate with other technologies and with each other. You could, for example, construct processes that interact with each other like this:
I’ve only scratched the surface of the deep set of features provided by the applications discussed here. Understanding the technology options you have for data management and the interplay among these applications is the first step in making the right choice for your project.
Do you have any projects to share where you had to use two or more of these technologies in an innovative way? Did you use the integration features listed here or maybe a few that I missed?
I received the following question about this article:
Could you give an example of a data management project that would use each of the alternatives?
Here's my quick take:
It's been 14 years since SAS acquired DataFlux, and the product lines still aren't integrated. That must be some kind of record.
Or does SAS not want to integrate them? It's hard to tell from new releases where SAS is going with the Data Management/Integration offering(s). As of today, I consider it a best-of-breed offering, not an end-to-end one.
Hey Linus,
As the product manager for SAS Data Management, allow me to address your comments.
To understand why things are the way they are, it is helpful to understand the history of SAS and DataFlux. When DataFlux was originally acquired, it was established and run as a fully independent subsidiary of SAS for more than 11 years. DataFlux had its own customers, independent of SAS, and as a result had different market requirements that it was being positioned to satisfy. This explains why DataFlux technology and SAS DI were developed independently of one another and why they were not integrated during this time.
As the data management market changed and evolved, the strategy of having an independent DataFlux also changed, and in 2012, DataFlux was closed as a subsidiary and its products and people became part of the newly established SAS Data Management division. Since then, product releases have largely focused on integrating the two product stacks and replacing the separate "DataFlux" and "SAS" approaches with a single "SAS Data Management" approach and offering.
With that said, there are some things to keep in mind:
So, to sum up my response to your question: I can appreciate why you have some questions about what direction things are headed for SAS Data Management. I can also assure you that our current offerings as well as our roadmap reflect our commitment to a single, fully integrated data management stack for all of SAS. Expect to see even more work to come from SAS Data Management that demonstrates this approach.
Regards,
Mike F.
Thank you for your adequate and informative answer. I understand that you have some issues. But DataFlux has been bundled with Data Integration from almost the beginning, so I don't think most SAS customers/partners have seen it as a separate company.
So I take from your answer that the DataFlux leg will be the foundation of this data management platform. And I have seen very neat things coming out here, like the business rules manager.
But still. Metadata and data integration are the foundation, the bread and butter, of all major data warehouses. So I think you need to address where you are going with data integration in that context. We still remember the horror of moving from WA to ETLS 😉
Cheers,
Linus
Excellent article and discussion; I'll certainly be sending it along to many of my colleagues.
To me, a discussion of the SAS data integration toolkit has to include the Base SAS facilities. In my opinion, this is a key differentiator of SAS from other products.
What I'm referring to is the "95%" problem; I can get 95% of what I need quickly and effectively using the interactive, powerful tools. Now what do I do about the last 5%?
With most products, I need to drop down into C or VB to develop it. Both of these languages are EXTREMELY low-productivity for data management tasks. Or I can try to develop a SQL stored procedure to implement the requirement, usually leading to a piece of SQL that is highly inefficient and completely unmaintainable.
With the SAS toolkit, all I need to do is find a competent Base SAS programmer, and they can use Base SAS, which is designed from the ground up as a productive tool for data management tasks, to fill in the gap. This code can then be exposed as a custom stored process, and be integrated into the solution.
Thanks again, everyone!
Tom
As a follow-up to this thread, I wanted to point out some of the resources that exist for folks who have questions about Data Management product vision or roadmap.
First, you can post your question to this Community site, and I encourage anyone to do so. I or others here at SAS will do our best to answer your questions as forthrightly as possible for the benefit of everyone. For some specifics, however, a direct, offline follow-up may be required.
Second, you can check out the papers and proceedings from each SAS Global Forum for hints of things that are coming and the strategic direction of the product. There are clips up on YouTube where folks on our R&D team talk about new features and, in some cases, preview future products.
Third, all SAS customers can request a vision and roadmap presentation from their SAS account team. This is useful if you need a more personalized set of details specific for the product(s) that your organization licenses.
I'm sure others exist, but I just wanted to note some of the best ways to get this information. Thank you and keep the discussion going!
Mike F.