
Can SAS/DataFlux DMStudio Data Jobs be replicated in SAS Viya as SAS Studio Flows?

Started ‎12-10-2024 by
Modified ‎02-24-2025 by

As organisations migrate to SAS Viya, many SAS/DataFlux users are wondering how much of the functionality in DMStudio Data Jobs can now be replicated in SAS Studio Flows, and whether it can be done without resorting to Code Steps in the Flow.

 

The answer is a definite YES!  A great many DMStudio Data Jobs can now be replicated as no-code SAS Studio Flows.

 

Here's a screenshot of a portion of one of my favourite demo DMStudio Data Jobs, which does a variety of Data Quality operations, Address Verification, Matching, Clustering, and Survivorship:

Arn_Toporowski_0-1733796625261.png

 

With SAS Viya I can now replicate the functionality of this Data Job in a SAS Studio Flow with a combination of standard Steps and Custom Steps.

(Refer to the screenshots farther down in this article. I also have the sample data and .JSON export files available, so you can import this demo into your own environment if you like.)

 

As of SAS Viya release 2024.09, standard Steps for SAS Studio Flows fill the gaps that previously existed in Viya:

  • Under Data Quality:
    • Clean Data, including tabs for:
      • Standardization
      • Casing
      • Identification Analysis
      • Gender Analysis
      • Pattern Analysis
    • Match Codes
    • Parse Data
  • Under Integrate:
    • Load Table
    • Merge Table
  • Under Enrichment:
    • Geocoding Data
    • Verify with Loqate:
      • Address Verification
      • Geocoding (using Loqate)
      • Phone Verification (with enhanced info compared to DMStudio)
      • Email Verification (with enhanced info compared to DMStudio)
  • Under Transform Data:
    • Branch Rows
    • Calculate Columns
    • Filter Rows
    • Insert Rows
    • Manage Columns
    • Mask Data
    • Query
    • Rank Data
    • Remove Duplicates
    • Sort
    • Split Columns
    • Stack Columns
    • Transpose Data
    • Union Rows
  • Under Visualize Data: sixteen types of charts, plots, and maps

 

There are also many SAS Studio Custom Steps published in the GitHub repository (https://github.com/sassoftware/sas-studio-custom-steps).

Some Custom Steps that DMStudio users will find useful include:

  • Anonymize and Mask Data
  • Append Table
  • CAS - Generate Unique ID
  • DQ - Change Case
  • DQ - Cluster Analysis
  • DQ - Identify
  • DQ - Create QKB Reference Table
  • DQ - Match Code
  • DQ - Parsing
  • DQ - Surviving Record
  • Export - Parquet
  • Extract Text Features
  • FTP Directory Listing
  • FTP Download Files
  • GeoDistance with Rounding
  • Great Expectations - Execute Rule
  • Great Expectations - Generate Expectation Suite
  • Great Expectations - Run Expectation Suite
  • HTTP Request
  • Import - ADLS File Reader
  • Import - CSV with long column names
  • Import - Data Ingestion Auto Pilot
  • Import - Extract Table from PDF
  • Import - HTML Table
  • Lookup
  • Loop Deployed Object
  • Loop Flow
  • NLP - Extract Entities
  • NLP - Identify Language
  • OCR - Document Analysis
  • Send SMTP Email
  • Send Teams Message
  • Surrogate Key Generator
  • Translate Text

Even the EEL code in DataFlux Expression nodes can be moved into SAS PROC EEL code, as described here: 

https://communities.sas.com/t5/SAS-Communities-Library/SAS-Viya-Using-PROC-EEL-in-SAS-Studio/ta-p/87...

 

This means that a great many DMStudio Data Jobs can now be replicated as no-code SAS Studio Flows.

 

Below are a couple of screenshots of my SAS Studio Flow, which shows a typical use case of Data Quality and Entity Resolution using Steps like: Branch Rows, Clean Data, Parse Data, DQ - Identify, Verify with Loqate, Calculate Columns, Manage Columns, Union Rows, Match Codes, DQ - Clustering, and DQ - Surviving Record.

 

It uses sample fictional data from the USA and Canada.
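
To make the matching and clustering idea behind this flow more concrete, here is a minimal Python sketch. It is not what the Match Codes or DQ - Clustering steps do internally: SAS builds match codes from QKB definitions, whereas the `crude_match_key` function below is a hypothetical stand-in that simply normalises a name and postal code. The records and column names are invented for illustration.

```python
import re

def crude_match_key(name: str, postal_code: str) -> str:
    """Hypothetical stand-in for a match code: normalised name plus postal code."""
    # Uppercase the name and keep only runs of letters as tokens
    tokens = re.findall(r"[A-Z]+", name.upper())
    # Drop single-letter tokens (initials) and sort so word order does not matter
    tokens = sorted(t for t in tokens if len(t) > 1)
    # Strip spaces and punctuation from the postal code
    postal = re.sub(r"[^A-Z0-9]", "", postal_code.upper())
    return "".join(tokens) + "|" + postal

def cluster_by_key(records):
    """Give the same cluster id to every record that shares a match key."""
    cluster_ids = {}
    clustered = []
    for rec in records:
        key = crude_match_key(rec["name"], rec["postal_code"])
        # First time a key is seen it gets the next cluster number
        cluster_id = cluster_ids.setdefault(key, len(cluster_ids) + 1)
        clustered.append({**rec, "cluster_id": cluster_id})
    return clustered

# Invented sample rows (not the ACCOUNTS data from the demo)
records = [
    {"name": "John A. Smith", "postal_code": "K1A 0B1"},
    {"name": "smith, john", "postal_code": "k1a0b1"},
    {"name": "Mary Jones", "postal_code": "27513"},
]
clustered = cluster_by_key(records)
for row in clustered:
    print(row["cluster_id"], row["name"])
```

In this toy example the two differently formatted "John Smith" rows land in the same cluster, which is the kind of grouping that a later survivorship step works on.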

 

The first screenshot (below) shows the flow at the first node (the incoming ACCOUNTS table), previewing the 22 rows of sample data, which contain obvious data quality issues and duplicate contact information.

 

image 25.png

 

The next screenshot (below) shows the flow previewing the resulting data at the end, after survivorship has reduced the 22 rows to 7 surviving rows using a simple record rule (best Address Verification Code).

 

image 26.png
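
The survivorship rule described above (keep the record with the best Address Verification Code in each cluster) can be sketched in Python. This is only an illustration under stated assumptions, not how the DQ - Surviving Record step is implemented: the `cluster_id` and `status` columns, the sample rows, and the ranking of status letters are all hypothetical.

```python
from itertools import groupby

# Assumed ranking of a verification status letter (lower is better).
# Purely illustrative; it is not taken from the Loqate or SAS documentation.
STATUS_RANK = {"V": 0, "P": 1, "A": 2, "U": 3}

def surviving_records(rows):
    """Keep one row per cluster: the row whose status ranks best."""
    # sorted() is stable, so rows keep their input order within each cluster
    ordered = sorted(rows, key=lambda r: r["cluster_id"])
    survivors = []
    for _, group in groupby(ordered, key=lambda r: r["cluster_id"]):
        # min() returns the first best row, so ties keep the earliest input row
        survivors.append(min(group, key=lambda r: STATUS_RANK[r["status"]]))
    return survivors

# Invented clustered rows (not the ACCOUNTS data from the demo)
rows = [
    {"id": 1, "cluster_id": 1, "status": "P"},
    {"id": 2, "cluster_id": 1, "status": "V"},
    {"id": 3, "cluster_id": 2, "status": "U"},
    {"id": 4, "cluster_id": 1, "status": "V"},
    {"id": 5, "cluster_id": 3, "status": "A"},
    {"id": 6, "cluster_id": 3, "status": "P"},
]
survivors = surviving_records(rows)
for row in survivors:
    print(row["cluster_id"], row["id"], row["status"])
```

Here six rows in three clusters collapse to three survivors, in the same way that the demo flow reduces 22 rows to 7.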

 

If you want to take a closer look at all the steps in this flow, you can import the demo flow, the demo data, and the Custom Steps used into your own environment. I would just need to email you three .JSON export files and the ACCOUNTS.CSV file containing the 22 rows of sample data. Please message me here if you would like to get these assets.

 

If you need guidance on how to import these assets into your environment please refer to this SAS Communities article: 

Migrating SAS Studio Flows containing Custom Steps - SAS Support Communities

 

For details about many of the steps in the flow you can also check out this excellent article here on SAS Support Communities written by my colleague Patric Hamilton:  Data Brilliance Unleashed: SAS Data Quality against Databricks - Preci... - SAS Support Communities

 

 

 

 

 

Comments

This is exactly what I've been seeking for some time: a relatively easy means to replicate my desktop SAS-DM Studio canvasses into a more cloud-oriented platform i.e. SAS VIYA. Having data quality as the foundation of a good "data supply chain" with the ability to seamlessly move data from stage-to-stage is paramount, and this provides concise guidelines and ideal visuals to achieve just that. The examples given are easy to follow and maintain a logical sequence of functional nodes both in Viya and in the originating SAS-DM image at the start. 
