As organisations migrate to SAS/Viya, many SAS/DataFlux users are wondering how much of the functionality of DMStudio Data Jobs can now be replicated in SAS Studio Flows. And can it be done without resorting to Code Steps in the Flow?
The answer is a definite YES! A great many DMStudio Data Jobs can now be replicated as no-code SAS Studio Flows.
Here's a screenshot of a portion of one of my favourite demo DMStudio Data Jobs, which performs a variety of Data Quality operations: Address Verification, Matching, Clustering, and Survivorship.
With SAS/Viya I can now replicate the functionality of this Data Job in a SAS Studio Flow with a combination of standard Steps and Custom Steps.
(Refer to the screenshots farther down in this article. I also have the sample data and .JSON export files available, so you can import this demo into your own environment if you like.)
As of SAS/Viya release 2024.09, there are standard Steps for SAS Studio Flows that fill the gaps that previously existed in Viya:
There are also many SAS Studio Custom Steps published in the GitHub repository (https://github.com/sassoftware/sas-studio-custom-steps).
Some Custom Steps that DMStudio users will find useful include:
Even the EEL code in DataFlux Expression nodes can be moved into SAS PROC EEL code, as described here:
This means that a great many DMStudio Data Jobs can now be replicated as no-code SAS Studio Flows.
Below are a couple of screenshots of my SAS Studio Flow, which shows a typical use case of Data Quality and Entity Resolution using Steps like: Branch Rows, Clean Data, Parse Data, DQ - Identify, Verify with Loqate, Calculate Columns, Manage Columns, Union Rows, Match Codes, DQ - Clustering, and DQ - Surviving Record.
It uses sample fictional data from the USA and Canada.
The first screenshot (below) shows the flow at the first node (the incoming ACCOUNTS table), previewing the incoming data, with obvious data quality issues and duplicate contact information in the 22 rows of sample data.
The next screenshot (below) shows the flow previewing the resulting data at the end, after survivorship (using a simple record rule: best Address Verification Code), which reduces the 22 rows to 7 surviving rows.
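For readers who think in code, the idea behind that record rule can be sketched in a few lines of Python. This is only an illustration of the concept, not how the DQ - Surviving Record step is implemented. The column names `cluster_id` and `avc_score`, the sample rows, and the use of a single numeric score to stand in for the Address Verification Code are all assumptions made for the example.

```python
def survive(rows, cluster_key="cluster_id", score_key="avc_score"):
    """Keep one record per cluster: the one with the highest score.

    On a tie, the record seen first in the cluster is kept.
    Both key names are hypothetical, chosen for this sketch.
    """
    best = {}
    for row in rows:
        cid = row[cluster_key]
        if cid not in best or row[score_key] > best[cid][score_key]:
            best[cid] = row
    return list(best.values())


# Hypothetical clustered rows; a higher avc_score means a better-verified address.
rows = [
    {"cluster_id": 1, "name": "Bob Smith", "avc_score": 60},
    {"cluster_id": 1, "name": "Robert Smith", "avc_score": 100},
    {"cluster_id": 2, "name": "Ann Lee", "avc_score": 80},
]

survivors = survive(rows)
for record in survivors:
    print(record["cluster_id"], record["name"])
```

In the Flow itself none of this has to be written by hand: clustering assigns the cluster IDs and the survivorship step applies the rule.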
If you want to take a closer look at all the steps in this flow, you can import the demo flow, the demo data, and the Custom Steps used into your own environment. I would just need to email you three .JSON export files and the ACCOUNTS.CSV file containing the 22 rows of sample data. Please message me here if you would like to get these assets.
If you need guidance on how to import these assets into your environment please refer to this SAS Communities article:
Migrating SAS Studio Flows containing Custom Steps - SAS Support Communities
For details about many of the steps in the flow you can also check out this excellent article here on SAS Support Communities written by my colleague Patric Hamilton: Data Brilliance Unleashed: SAS Data Quality against Databricks - Preci... - SAS Support Communities
This is exactly what I've been seeking for some time: a relatively easy means to replicate my desktop SAS-DM Studio canvasses into a more cloud-oriented platform i.e. SAS VIYA. Having data quality as the foundation of a good "data supply chain" with the ability to seamlessly move data from stage-to-stage is paramount, and this provides concise guidelines and ideal visuals to achieve just that. The examples given are easy to follow and maintain a logical sequence of functional nodes both in Viya and in the originating SAS-DM image at the start.