SAS' scheduling capabilities have come a long way since the Warehouse Administrator and AT command days. SAS now supports multi-job flows with complex dependencies, flexible execution schedules, sub-flows, command line integration, and more.
Sample Job Flow
However, there is still one feature we have to implement ourselves: event-based triggers. In this article, we'll explore creating event-driven processing with just SAS Environment Manager Jobs & Flows and a few simple Linux commands.
In this scenario we want to trigger a SAS flow as soon as a third-party process completes. Our SAS flow uses the output from the third-party job as input. However, we're told the process could finish anywhere from 9:00 PM to 1:00 AM. We don't want to wait around until 1:00 every morning to start our flow, though; we need to trigger it as soon as the job completes.
To build our event-based trigger, we'll combine a few different elements.
First we'll ask the owners of the third party process to output a file when it completes. We'll call this a "process completion file." It is usually an empty text file but may contain some metadata-type information like the process completion time and/or how many records were written to the output files. Its sole purpose is to signal that the process is complete and its output is ready. Our file will be named "factTableComplete.tag." The creation of this file is the event we'll use to trigger our process.
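For example, the last step of the third party process might be something as simple as the following shell commands (a minimal sketch; the exact mechanism and any metadata are up to the process owners, and the record count variable below is purely hypothetical):

touch /gelcontent/data/factTableComplete.tag
# or, with some optional metadata (RECORD_COUNT is a hypothetical variable set earlier in that script)
echo "Completed $(date). Records written: ${RECORD_COUNT}" > /gelcontent/data/factTableComplete.tag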
Next we'll supplement our SAS Flow with a Command Line Action node. In it, we'll issue a Linux command to look for the trigger file. Our command will look like this:
'find "/gelcontent/data/factTableComplete.tag"'
With the command line action, our job flow will look like this:
In bash, the find command returns a code of zero if it finds the file and a code of one if it does not. So, we'll only let the downstream jobs execute if we get a zero; in other words, the downstream jobs will only run once the 3rd party job is complete. Note that we have three jobs directly dependent on the command line action, so we'll have to set this condition on all three dependencies.
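You can confirm this behavior at a Linux prompt before wiring it into the flow (a quick sketch using the same path):

find "/gelcontent/data/factTableComplete.tag"
echo $?     # prints 0 if the file exists; 1 (plus a "No such file or directory" message) if it does not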
Now we'll schedule our job flow to run every 5 minutes for 60 occurrences. This gives us a 5-hour window for the flow to find the trigger file. Each execution before the trigger file arrives will end when the file is not found. When the file is found, the downstream jobs will execute.
Another Command Line Action to Remove the Trigger File
Once the trigger file is located and the job flow runs beyond the initial Command Line Action, we only want the downstream jobs to run once. So we'll add another Command Line Action to remove the trigger file, ensuring the downstream jobs don't run again every 5 minutes. We'll also set the file deletion action to start when the dependent jobs start, not when they end. This ensures the file is removed before the flow's next scheduled execution.
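The command in this second Command Line Action can be as simple as removing the file at the same path (a minimal sketch; adjust to whatever cleanup the process owners prefer):

rm "/gelcontent/data/factTableComplete.tag"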
With this scheme in place and running, here is a sample of the flow execution history. First we see two executions where the flow was "Canceled." Digging deeper into the second canceled flow, we see that the 3rd party completion file wasn't found, which ended the flow. Then the completion file lands, and the next flow execution succeeds: the file was detected and the downstream jobs executed successfully. Finally, the flow returns to the canceled state since the completion file is not found again, the previous execution having removed it.
Find more articles from SAS Global Enablement and Learning here.