Just like with movie sequels, you don’t have to watch the original to follow the plot, but it definitely helps. Seeing the first film gives you better insight into the characters, background, and little details that make the story richer. The same goes for this blog. It’s totally fine to start here, but if you want the full picture, reading the original paper is a great idea. Still, if you’d rather dive in here first, I won’t hold it against you.
If you are familiar with the original Best Practices for Scheduling in SAS Viya, then you are in for a treat, because this blog is an updated version of that paper. Many people who read the original have requested an update. Five years have passed, bringing several changes along the way. This blog details changes since the original paper, describes the scheduling types available, and helps you take full advantage of scheduling in the SAS Viya environment. It discusses CAS Table State Management, the Command Line Interface, the Jobs and Flows page in SAS Environment Manager, the SAS Job Execution Web Application, and SAS Studio, and it introduces Apache Airflow.
You can use the following five methods to schedule in SAS Viya: CAS Table State Management, the Command Line Interface, the Jobs and Flows page in SAS Environment Manager, the SAS Job Execution Web Application, and SAS Studio. Each scheduling method is unique. However, they all use the same two microservices to schedule and execute jobs: the scheduling microservice and the job execution microservice. The scheduling microservice starts the job, which is a synchronous operation to the job execution microservice that runs the job. This blog explores each method’s characteristics and discusses best practices for scheduling. The article is organized alphabetically and is broken up into two parts.
CAS Table State Management is a great method to load, unload and import tables quickly. CAS table state management is a feature in SAS® Environment Manager for SAS® Viya®. To access this method, click the Jobs and Flows plug-in on the left navigation pane in SAS Environment Manager. If you are a SAS Environment Manager administrator, you see three sample jobs on the Scheduling tab. You can use these jobs to manage CAS tables.
Chaining Jobs Together is a feature developed after the publication of the original paper. The advantage of chaining is that the second job starts as soon as the first job completes. Sequencing jobs this way is useful when one job depends on another. Figure 1 shows the Load job chained to the Unload job.
Figure 1: Chaining Jobs together
The UP:Unload job is scheduled to run after 5 minutes. The UP:Load job is chained to the UP:Unload job. It takes 45 seconds for the UP:Load job to start after UP:Unload completes. This is quite simple to set up. On the Scheduling tab, open the Job Properties of the UP:Load job and copy the ID value, as shown in Figure 2.
Figure 2: Job Properties of the UP: Load Job
Next, open the Job Properties of the UP: Unload job and go to the Arguments tab. Set the successJobID argument to the ID of the job you want chained, in this case UP: Load. Figure 3 shows the Job Properties of UP: Unload.
Figure 3: Job Properties of the UP: Unload Job
Figure 4: Scheduling Tab of the Jobs and Flows page in Environment Manager
Figure 4 shows that only the UP: Unload job is scheduled. However, because UP:Load is chained to it, UP:Load runs as well. Figure 5 shows the Monitoring tab and illustrates this point. This is another advantage of chaining jobs together: it reduces the number of jobs that need to be scheduled.
Figure 5: Monitoring Tab of the Jobs and Flows page in Environment Manager
As for the disadvantage, be careful with the successJobID argument. If you are not careful, you can create an endless loop of running jobs. For example, take the two jobs in this case, UP:Unload and UP:Load. If the successJobID of UP:Unload points to UP:Load and the successJobID of UP:Load points back to UP:Unload, each job kicks off the other repeatedly. Even with only one of the jobs scheduled, the jobs run indefinitely.
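The loop described above can be caught before it happens by following the chain of successJobID values and checking whether it ever returns to a job it has already visited. The sketch below is only an illustration in plain Python: the dictionary is a hypothetical stand-in for the job properties (job name mapped to the job named in its successJobID), not something read from SAS Environment Manager.

```python
def find_chain_cycle(success_job_id, start):
    """Follow successJobID links from `start`.

    Returns the looping path as a list if the chain revisits a job,
    or None if the chain ends normally.
    """
    seen = []
    current = start
    while current is not None:
        if current in seen:
            # The chain came back to a job it already ran: endless loop.
            return seen[seen.index(current):] + [current]
        seen.append(current)
        current = success_job_id.get(current)
    return None


# Hypothetical mappings: job -> job referenced by its successJobID.
safe_chain = {"UP:Unload": "UP:Load"}
looping_chain = {"UP:Unload": "UP:Load", "UP:Load": "UP:Unload"}

print(find_chain_cycle(safe_chain, "UP:Unload"))
print(find_chain_cycle(looping_chain, "UP:Unload"))
```

The first call returns None because the chain stops after UP:Load. The second returns the path UP:Unload, UP:Load, UP:Unload, which is the endless loop to avoid.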
This method is great for programmers. The Command Line Interface (CLI) offers a programmatic way into the scheduling world of SAS Viya. The CLI is an administrative tool that you can run directly from your own machine. You install the CLI software on your machine, create a profile, and authenticate to your Viya server.
The CLI in Viya 4 has several additional commands for flows. The additional commands are cancel, resume, pause, release, show-history-tree, show-status, and trigger, as shown in Figure 6.
Figure 6: Command List for the CLI
The advantage of this feature is that the CLI offers commands to manage flows that the GUI does not, which also gives you another option for debugging job flows. As an example, we will use the trigger command. Trigger allows you to run a job flow manually.
Figure 7: Display of a listing of job flows.
Figure 7 is a screenshot that displays the job flow ID, name, version, and description of each flow. Figure 8 is a screenshot of the sas-viya job flows show command run with the job flow ID, c904f071-c67a-439a-8f76-e91bed0b01dc, of the ProcMeans Flow.
Figure 8: Output of the show command for the job flow, Proc Means Flow
Figure 9: Scheduling and triggering the flow
The syntax to trigger a flow is sas-viya job flows trigger --id job_flow_definition_id --sch scheduler_id. I first scheduled the flow by using the job flow ID, which returned the job_flow_definition_id. I then triggered the job flow with the job_flow_definition_id, 226f1ebf-035c-44ae-a865-2f6dde3dceca, and added the --sch parameter with the scheduler ID, 9a3db74b-465f-4d36-baf6-b3b8bad1a3b2, found in Figure 8. There are a couple of ways to check the status of the flow: Figure 10 displays the list history of the flow, while Figure 11 displays the show history of the flow.
Figure 10: List history of the flow
Figure 11: Show history of the flow
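Because the trigger command takes two different IDs, it is easy to paste one into the wrong place. As a rough sketch, the trigger syntax quoted above can be assembled in Python before it is run. The function name is hypothetical, and the UUID check is an assumption based on the IDs shown in this blog looking like UUIDs. The sketch only builds and prints the command; it does not call the CLI.

```python
import shlex
import uuid


def build_trigger_command(job_flow_definition_id, scheduler_id):
    """Build the argument list for: sas-viya job flows trigger --id ... --sch ..."""
    # Assumption: both IDs are UUID strings, like the ones shown in the figures.
    # uuid.UUID raises ValueError if a value is not a well-formed UUID.
    for value in (job_flow_definition_id, scheduler_id):
        uuid.UUID(value)
    return [
        "sas-viya", "job", "flows", "trigger",
        "--id", job_flow_definition_id,
        "--sch", scheduler_id,
    ]


command = build_trigger_command(
    "226f1ebf-035c-44ae-a865-2f6dde3dceca",  # job_flow_definition_id from the text
    "9a3db74b-465f-4d36-baf6-b3b8bad1a3b2",  # scheduler ID from the text
)
print(shlex.join(command))
```

On a machine where the CLI is installed and a profile is authenticated, the same list could be passed to subprocess.run instead of being printed.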
As for disadvantages, familiarity with JSON is required to use this method. I found a JSON validator to be very helpful. Also, there are a lot of different IDs associated with the scheduler. It takes practice to understand which ID gives you the desired outcome.
You can use the Jobs and Flows page in SAS Environment Manager to schedule or monitor jobs and flows. This page is available from the left navigation pane of the GUI.
I would like to highlight two additional features. Along with creating jobs, you can now create flows. A job is defined as a program that contains tasks that you need to run. A flow is defined as a container that can include jobs, flows, or a mixture of both. A flow must contain at least one job. The Job-Flow Scheduling Service is a microservice that manages job execution. SAS Job Flow Scheduler is a type of scheduler that the Job-Flow Scheduling Service supports and manages for executing job flows. Jobs can be part of flows, and flows (as sub-flows) can be part of another flow.
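The relationship between jobs, flows, and sub-flows can be pictured as a small tree. The sketch below is only a mental model in plain Python, not how SAS Viya stores flows. The class names and example names are hypothetical, and the validity check reads "a flow must contain at least one job" as "at least one job somewhere in the flow, including its sub-flows", which is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Job:
    name: str


@dataclass
class Flow:
    name: str
    # A flow can hold jobs, other flows (sub-flows), or a mixture of both.
    members: List[Union[Job, "Flow"]] = field(default_factory=list)

    def job_names(self):
        """Collect the names of all jobs in this flow and its sub-flows."""
        names = []
        for member in self.members:
            if isinstance(member, Job):
                names.append(member.name)
            else:
                names.extend(member.job_names())
        return names

    def is_valid(self):
        """Assumed rule: the flow needs at least one job somewhere inside it."""
        return len(self.job_names()) > 0


reports = Flow("Reports", [Job("ProcMeans")])        # a sub-flow with one job
nightly = Flow("Nightly", [Job("Extract"), reports])  # a job plus a sub-flow

print(nightly.job_names())
print(nightly.is_valid(), Flow("Empty").is_valid())
```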
Figure 12: Options of a flow
The options are defined as follows:
If the flow is not already scheduled, the Schedule option appears, which lets you schedule the flow.
For event types, a file event can be used as a trigger.
Figure 13: New File Event Window
The advantage is that users who prefer point-and-click over the command line have more options within the scheduler, including options to control the outcome of a flow. For more, see the blog Tips for Scheduling Flows with SAS Viya Environment Manager. You can also combine a time event and a file event to start the flow. As for disadvantages, when you combine a time event and a file event, make sure the combined condition can actually evaluate to true so the flow can start. If either the time event or the file event does not occur, the flow will not start, which can disrupt your schedule.
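The combined condition described above behaves like a logical AND: the flow starts only when both events are satisfied. The sketch below imitates that logic in plain Python to show why a missing file blocks the flow even after the scheduled time has passed. It is a simplified stand-in, not the scheduler's implementation, and the function name and file name are hypothetical.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def flow_can_start(now, scheduled_time, trigger_file):
    """Return True only when BOTH the time event and the file event are satisfied."""
    time_event_met = now >= scheduled_time
    file_event_met = Path(trigger_file).exists()
    return time_event_met and file_event_met


scheduled = datetime(2025, 1, 1, 6, 0)
after_schedule = datetime(2025, 1, 1, 6, 5)

with tempfile.TemporaryDirectory() as folder:
    trigger = Path(folder) / "sales_ready.csv"  # hypothetical trigger file
    # Time has passed but the file is missing, so the flow does not start.
    print(flow_can_start(after_schedule, scheduled, trigger))
    trigger.touch()
    # Both conditions hold, so the flow can start.
    print(flow_can_start(after_schedule, scheduled, trigger))
```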
SAS Job Execution Web Application is comparable to a SAS®9 stored process. You can access it from a web browser with the SASJobExecution extension.
Updates to the SAS Job Execution Web Application from the Original Paper
Jobs can be scheduled within the application. There is a new menu option on the left-hand side to schedule a job.
Figure 14: SAS Job Execution window
The advantage is that you can schedule directly from the SAS Job Execution Web Application. Within the application, you can add parameters to jobs, such as _action=schedule, modify the timeout value, and create jobs. These jobs are used for web reporting, performing analytics, building web applications, and delivering content to clients, and they can include prompts. The disadvantage is that if you want to monitor the job, you still need to go to SAS Environment Manager.
SAS Studio is a web-based programming environment for creating SAS code that you can access from SAS Drive.
Jobs can be scheduled within the application. There is a menu option on the left-hand side to schedule a job.
Figure 15: Schedule options in SAS Studio
Being able to schedule from SAS Studio is great for programmers. After developing code, it is convenient to schedule the job from the same application. You can schedule the job whether or not you save the program. If you do not save the program in SAS Studio, the job appears on the Scheduling tab in SAS Environment Manager with the name SAS Program_Copyn, where n is the sequence number of the scheduled, unsaved program.
Figure 16: Scheduling tab
The disadvantage is that you still have to monitor and manage the schedule from SAS Environment Manager. Both monitoring a job and modifying the schedule of a previously scheduled job are performed in SAS Environment Manager.
Apache Airflow is an open-source workflow orchestration tool for scheduling, monitoring, and managing data pipelines. Airflow can be deployed in Kubernetes, and SAS Viya can take advantage of that same Kubernetes cluster. One of the benefits of Kubernetes is being able to scale up and down as necessary. A DAG (directed acyclic graph) is a data pipeline in Airflow that is made up of tasks. Airflow includes an application that allows you to monitor your DAGs and tasks. To learn more about scheduling with Airflow, see the blog titled Scheduling SAS Jobs and SAS Studio Flows with Apache Airflow.
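To make the DAG idea concrete, here is a small sketch that uses only the Python standard library (graphlib, available in Python 3.9 and later). It is not the Airflow API. The task names are hypothetical and the code only shows what "directed" and "acyclic" mean for a pipeline: each task lists the tasks it depends on, and a valid run order exists only when there is no cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "load_tables": set(),
    "run_report": {"load_tables"},
    "unload_tables": {"run_report"},
}

# A DAG always has at least one valid run order.
run_order = list(TopologicalSorter(pipeline).static_order())
print(run_order)

# Adding a dependency that points back creates a cycle, so it is no longer a DAG.
not_a_dag = {"load_tables": {"unload_tables"}, "unload_tables": {"load_tables"}}
try:
    list(TopologicalSorter(not_a_dag).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

In this example the run order is load_tables, run_report, unload_tables, and the second graph is rejected because its two tasks depend on each other.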
I hope you enjoyed exploring the changes and new features since the original scheduling paper. While some aspects of scheduling have evolved, one thing hasn’t changed: SAS Viya for Scheduling continues to offer a wealth of options and provides flexible, concise, and effortless ways to schedule tasks. No matter your approach, SAS Viya for Scheduling truly has a method that fits your needs.