Introduction
In January 2024, we held a #DataOpsWeek event with the aims of showcasing our work, learning from our colleagues and building the SAS DataOps community. Each day had a theme, based on Alexey Vodilin's blog on the 5 pillars for a successful DataOps strategy. Our second theme of the week was "Orchestration".
Orchestration
With SAS Viya, we are redefining the boundaries of how you think about Orchestration strategies on SAS products. Aside from the out-of-the-box scheduling capabilities of SAS Viya, we now support an integration with Apache Airflow. A lot of our talks on the day focussed on how this can be exploited, especially to cater for complex scenarios and multi-integration touch points.
Alexey explains in his blog why it is important to think about a holistic picture when you are defining Orchestration principles:
"It's not enough to produce siloed artifacts and share them in a central repository. To solve a specific business use case, artifacts of several types should be seamlessly orchestrated together, starting from a successful data ingestion, proceeding with building or retraining a machine learning model, and ending up with communicating insights back to the business application.
This is where orchestration technologies come into play. Orchestration processes allow integrating assets of multiple types (like data transformation flows, models, API calls, etc.) in a value pipeline that can serve as a single pane of glass for understanding, building and monitoring a specific process. As such, orchestration flows should also be versioned and shareable."
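To make Alexey's point a little more concrete, here is a minimal sketch of what "orchestrating artifacts together" means in practice: each task declares what it depends on, and the orchestrator works out a valid run order. It uses only the Python standard library (`graphlib`, Python 3.9+) and is not the Airflow or SAS Viya API. The task names and placeholder actions are hypothetical, chosen to mirror the ingestion, retraining and insight steps from the quote.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each maps to the set of tasks it depends on.
PIPELINE = {
    "ingest_data": set(),
    "retrain_model": {"ingest_data"},
    "publish_insights": {"retrain_model"},
}


def run_pipeline(pipeline, actions):
    """Run each task only after its dependencies; return the execution order."""
    order = []
    for task in TopologicalSorter(pipeline).static_order():
        actions[task]()  # call the action registered for this task
        order.append(task)
    return order


# Placeholder actions standing in for real data flows, models and API calls.
log = []
ACTIONS = {name: (lambda n=name: log.append(n)) for name in PIPELINE}

executed = run_pipeline(PIPELINE, ACTIONS)
print(executed)
```

A real orchestrator such as Airflow adds scheduling, retries, monitoring and versioned pipeline definitions on top of this dependency ordering, which is where the "single pane of glass" comes from.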
Top Orchestration resources
I have compiled a list of top resources which will support Orchestration enablement for your teams.
Airflow:
SAS Viya and Apache Airflow in Kubernetes: A Peaceful Coexistence – Nicolas Robert
What’s New in SAS Airflow Provider – Nicolas Robert
Waiting for Something to Occur before Triggering SAS Jobs: Airflow Sensors – Nicolas Robert
Using Apache Airflow to automate SAS Viya administration tasks – Gerry Nelson
Workload Management:
Scaling to new heights. Exploring the auto-scaling capabilities of Viya Workload Management – Eoin Byrne
Argo Workflows:
SAS administration command-line interface in a container: Part 2 Kubernetes and Argo Workflow – Gerry Nelson
SAS Studio Custom Steps:
Switch on, switch off: run-time control of SAS Studio Custom Steps – Sundaresh Sankaran
Python:
Python Tools for SAS Viya
Reinforcement Learning:
How we removed hurdles to adoption – Sundaresh Sankaran
Summary
Finally, if I had to choose just one orchestration tool that we mentioned that day, it would definitely be Airflow. I’m impressed with the constant development of our SAS Airflow Provider and the way we can use Airflow with SAS Viya. Not only can we schedule and run SAS Flows, but we can also use it to automate some administration tasks!
I am also looking forward to further conversations about using Viya Workload Management, and to hearing how you manage your costs and infrastructure workloads.
You can find resources from our other DataOpsWeek themes here:
Collaboration
Orchestration (this blog)
Continuous delivery
Testing automation
Environmental management