
SAS IRM: Demystifying Job Flows for Beginners

Started 09-25-2023, modified 09-25-2023

Introduction

 

When working with one of the SAS Risk Solutions, we are often required to work with preloaded job flows that are surfaced in the respective solution’s UI. Naturally, we want to learn how the diagrammatic representations of job flows are created and how the nodes in a job flow perform their calculations. This post explains SAS Infrastructure for Risk Management (IRM) job flows by building a simple one from scratch.

 

BPMN Job Flows

In SAS IRM, we use Business Process Modeling Notation (BPMN) to construct the job flows. It is a common notation for defining executable business process diagrams. SAS IRM uses a small subset of BPMN to create job flows. A diagram view of a typical job flow in SAS IRM looks like:

 

sd_1_SASIRM_bl02_jobflow01.png

 

Job flows are graphical representations of a sequence of calculation steps. Some job flows contain tasks that are also complete job flows themselves – these are called subflows (we will not be talking about subflows in this post).

Nodes in a Job Flow

 

sd_2_SASIRM_bl02_nodes01-300x233.png

 

 

Nodes (also called tasks) are the basic building blocks of a job flow in SAS IRM. A task requires defined input and/or output objects. Input data objects are displayed above the task, connected by arrows pointing to the task. Output data objects are displayed below the task, with arrows leading away from it.

BPMN Generator Excel File

Job flow definitions in SAS IRM can be created in two ways. The first method uses the scripting client in SAS IRM – this is the recommended method. The second method uses a macro-enabled MS Excel file – this method is used mostly for learning purposes.

With the scripting client, SAS IRM leverages several types of macros (both foundational and SAS IRM specific macros) to automatically create the job flows and surface them in the SAS IRM UI. This method completely shields us from the background activities related to job flow creation, such as registering job flows in SAS IRM and mapping the appropriate federated area to a job flow.

On the other hand, by using a macro-enabled Excel file to create BPMN job flows, we can learn each of these steps individually. The macro-enabled Excel file, called BPMN Generator.xlsm, is usually shipped with all SAS risk solutions. It can be found in the tools folder of the registered federated areas.


Relevant Federated Areas

A federated area is a folder structure that conforms to specific SAS IRM rules. It is a storage area for content, and it is the mechanism by which developers add custom content to SAS IRM. By default, SAS IRM provides a platform federated area and a sample federated area. The sample federated area is what we will use for creating our job flows.

 

sd_3_SASIRM_bl02_relevantfa01-300x148.png

 

 

The relevant federated area folders for creating a job flow using the BPMN Generator.xlsm file are:

 

 

 

  1. config – this folder contains:
    1. job_flow_definitions.csv – this file is used to register job flows in the SAS IRM system,
    2. jobflows.properties (in the messages subfolder) – this file is used to modify the labels for the objects in a job flow.
  2. jobflow – this folder is where all our BPMN files are stored.
  3. source – this folder is where we will store the program code associated with the nodes in a job flow. To be precise, the programs are stored in the \source\sas\nodes folder.
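
As a quick sanity check, the folders and files listed above can be tested for existence from a SAS session. The following is a minimal sketch, not part of the original setup. The macro variable fa_root is a hypothetical name, and its value is a placeholder that must be replaced with the actual location of the federated area on your server. The messages subfolder for jobflows.properties follows the path used later in this post.

```sas
/* Placeholder: replace with the real root of the federated area */
%let fa_root = \<SAS-config-directory>\Lev1\AppData\SASIRM\fa.sample.3.6;

data _null_;
  length item $ 100 full_path $ 400;
  /* Relative paths named in the list above */
  do item = "config\job_flow_definitions.csv",
            "config\messages\jobflows.properties",
            "jobflow",
            "source\sas\nodes";
    full_path = catx("\", "&fa_root", item);
    /* FILEEXIST returns 1 when the file or folder exists, 0 otherwise */
    found = fileexist(strip(full_path));
    put item= found=;
  end;
run;
```

Each line written to the log shows one item and whether it was found. With the placeholder left unchanged, every item is reported as not found.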

Custom Job Flow Details

 

sd_4_SASIRM_bl02_jobflow02.png

 

The job flow that we are creating here consists of two tasks that together calculate the net present value (NPV) of a stream of future cash flows. The two tasks are:

  1. Task 1 creates a copy of the CASHFLOWS data that resides in the landing area of the Sample.3.6 federated area.
    1. INPUT – STAGING.CASHFLOWS
    2. OUTPUT – TMP.CASHFLOWS
    Note: The STAGING library points to the landing area folder of the Sample.3.6 federated area; the TMP library points to the location to which the system environment variable TMP is mapped.

  2. Task 2 adds a new column that holds the NPVs.
    1. INPUT – TMP.CASHFLOWS
    2. OUTPUT – TMP.DIS_CASHFLOWS
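
Before wiring these tasks into SAS IRM, their logic can be tried on a small stand-in table in the WORK library. This is only an illustrative sketch: the two input rows are made-up values and the WORK table names are hypothetical. The column names and the continuous-discounting formula are taken from the job2.sas program shown later in this post.

```sas
/* Stand-in for STAGING.CASHFLOWS with made-up values */
data work.cashflows;
  input FACE_VALUE_AMT Discount_Rate TimeToExpiration;
  datalines;
1000 0.05 2
500 0.03 1
;
run;

/* Task 1 equivalent: copy the input table */
data work.cashflows_copy;
  set work.cashflows;
run;

/* Task 2 equivalent: add the discounted value as a new column */
data work.dis_cashflows;
  set work.cashflows_copy;
  NPV_AMT = exp(-Discount_Rate * TimeToExpiration) * FACE_VALUE_AMT;
run;

proc print data=work.dis_cashflows;
run;
```

For the first row, 1000 * exp(-0.05 * 2) is approximately 904.84; for the second, 500 * exp(-0.03 * 1) is approximately 485.22.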


Creating the Job Flow using BPMN Generator Excel File


To create the job flow described above we will first create it outside of SAS IRM and then copy it into SAS IRM (we want to rule out the possibility of crashing the entire SAS IRM server in case something goes wrong while creating the BPMN file).

  1. Partially recreate the sample.3.6 federated area folder structure on the desktop.
    1. Create a folder on the desktop and name it Jobflow Creation.
    2. Within the Jobflow Creation folder, create three subfolders and name them:
      1. jobflow
      2. source
      3. tools
    3. Copy the BPMN Generator.xlsm file from the tools folder in a relevant federated area located at
      \<sas-configuration-directory>\Lev1\AppData\SASIRM\fa.<federated-area> to the Desktop > Jobflow Creation > tools folder.
    4. Within Desktop > Jobflow Creation > jobflow, create a folder and name it demo.
    5. In the Desktop > Jobflow Creation > source folder, create a folder called sas, and within this sas folder, create another new folder called nodes.
       At this point our Jobflow Creation folder on the Desktop should look something like this:
       

      sd_5_SASIRM_bl02_desktopfolder01.png

       

  2. Create the SAS programs for the nodes/tasks shown in the job flow diagram earlier.
    1. Open SAS Studio.
    2. Open a new program window and type the following code:
      /* Copy the CASHFLOWS table from the STAGING library to the TMP library */
      data tmp.cashflows;
        set staging.cashflows;
      run;
      
      1. The code makes a temporary copy of the CASHFLOWS data from the STAGING library to the TMP library.
      2. TMP is a temporary library mapped onto the TEMP folder in the OS.
      3. STAGING library points to the landing area of the federated area we are working on.
    3. Save the program as job1.sas in the nodes folder located at
      \Desktop\Jobflow Creation\source\sas\nodes.
    4. Open another program window in SAS Studio and type the following code:
      /* Discount each cash flow continuously: NPV = face value * exp(-rate * time) */
      data tmp.dis_cashflows;
        set tmp.cashflows;
        NPV_AMT = exp(-Discount_Rate * TimeToExpiration) * FACE_VALUE_AMT;
      run;
      1. The code above calculates the Net Present Value (NPV) after applying the appropriate discount rate on the TMP.CASHFLOWS data.
    5. Save the program as job2.sas in the nodes folder located at
      \Desktop\Jobflow Creation\source\sas\nodes.
  3. Create a BPMN file for the job flow described earlier. The BPMN Generator.xlsm file contains worksheets – Data, Checks and Report. We will use the Data worksheet to enter the required values. The information entered in the Data worksheet should look like the following:
     

    sd_6_SASIRM_bl02_bpmngen01-1024x218.png

     

    For better visibility the information from the screen shot above is presented in tabular form below:
     


    sd_7_SASIRM_bl02_bpmngen03.png

    1. Each row in the BPMN file contains information about an object from the job flow.
    2. The BPMN Path column should contain values relative to the \Jobflow Creation\jobflow folder on the Desktop. We also include the name of the job flow in the BPMN Path.
      So, the BPMN Path for all the objects in the job flow is the same: demo/demo_jobflow.bpmn, where demo_jobflow.bpmn is the name of the BPMN file.
    3. The ID column must have unique string values to identify different objects from the job flow. Because our first object is the START object, we specify the ID=start.
    4. The TYPE column requires us to enter specific keywords that tell SAS IRM what type of object we are creating. These values must be entered in uppercase. For the first object, we enter TYPE=START.
    5. In the LABEL column, we indicate the label of the object, so that it is displayed in the SAS IRM diagram view of the job flow. The start and the end nodes do not have any labels to display, but we cannot leave the LABEL column blank, so, we use a period (i.e., LABEL=.).
    6. The next object in the job flow is the first task (Task 1). We start from the next row in the Data worksheet of the Excel file and enter ID=job1.
    7. This is a node that will execute some code, which we indicate by specifying TYPE=SERVICE TASK.
    8. Specify LABEL=Copy fact table for cash flows for this object.
    9. Next, we need to specify the connection between the START and Task 1 node. We can do this by specifying start (i.e., the ID of the previous object) in the predecessor01 column; so, predecessor01=start.
    10. After this we need to let SAS IRM know what type of program we are going to execute for Task 1. Because we are going to execute a SAS program, enter type=sas in the SourceType column.
    11. In the SourcePath column, we specify the name of the program that is going to be executed as the first task. Enter source=job1.sas in the SourcePath column. This macro-enabled Excel file knows that it has to look for the job1.sas file in the \source\sas\nodes folder within the root (Jobflow Creation) folder.
    12. The next object specified in the Excel file is the input data for the first task. We specify the following values:
      1. BPMN Path=demo/demo_jobflow.bpmn
      2. ID=job1_in1
      3. Type=DATA OBJECT
      4. Label= STAGING.CASHFLOWS.SAS7BDAT (the extension is required)
      5. Input of=job1
    13. In the same way, the table above shows how the information for the remaining objects in the job flow was populated in the Excel file.
    14. After every object’s information has been entered in the BPMN Generator.xlsm file, click the Fill Down Checks (takes time) button at the top to validate that all the information in the Excel file has been entered correctly.

       

       

      sd_8_SASIRM_bl02_bpmngen02.png

       

    15. If there are no warnings and no errors, then the indicators for NO WARNINGS and NO ERRORS should both be green.
    16. When both the checks (errors and warnings) are green, click the Generate BPMN button.
    17. After this, a pop-up window is going to ask us to specify the BPMN root folder. It should already be populated as "..\jobflow". Click OK to accept the default value.
    18. When the BPMN file has been successfully created, another pop-up window informs us of the same and directs us to inspect the Reports worksheet.
    19. We can also inspect the \Desktop\Jobflow Creation\jobflow\demo folder to locate the demo_jobflow.bpmn file.
  4. Next, we copy all the relevant files from the Desktop to the SAS IRM server.
    1. Copy all the relevant files from the Jobflow Creation folder on the desktop to a federated area. Instead of creating a new federated area, we will use one of the production level federated areas that is already registered with the metadata in SAS IRM. We will use the Sample.3.6 federated area to achieve this.
    2. Create a new folder called demo within the jobflow folder of the Sample.3.6 federated area.
    3. Copy the demo_jobflow.bpmn file from \Desktop\Jobflow Creation\jobflow\demo folder to the \jobflow\demo folder in the Sample.3.6 federated area.
    4. Copy the job1.sas and job2.sas files from the \Desktop\Jobflow Creation\source\sas\nodes folder to the \source\sas\nodes folder in the Sample.3.6 federated area.
       Note: The Sample.3.6 federated area is located at \<SAS-config-directory>\Lev1\AppData\SASIRM\fa.sample.3.6
  5. Modify the job_flow_definitions configuration file to make the new job flow available in the New Instance window.
    1. Right-click the job_flow_definitions.csv file located at \<SAS-config-directory>\Lev1\AppData\SASIRM\fa.sample.3.6\config and select Edit with Notepad++ (or any other text editor).
    2. Include the following line at the bottom of the job_flow_definitions.csv file:
      demo,demo_jobflow,both,sample_36_configuration
      
      1. The first entry demo shows up as an additional entry in the Category field of the New Instance UI. This is the name of the folder in Sample.3.6 federated area where the new job flow is stored.
      2. Next, we have demo_jobflow, which shows up in the Flow field when we select demo in the Category field.
      3. The third term both is used to instruct SAS IRM whether the job flow is relevant for solo or group or both Entity roles.
      4. The final part of the entry (after the last comma) indicates the configuration set under which this job flow appears in the SAS IRM UI. This job flow will appear only when we select SAMPLE_36_CONFIGURATION in the Configuration field of the UI.
    3. Save the job_flow_definitions.csv file.
  6. Restart the SAS IRM server.
    1. Open Windows Services (services.msc).
    2. In the list of services displayed alphabetically, locate the service named SAS[Config-Lev1]SASServer8_1 – WebAppServer.
    3. Right-click on the SASServer8_1 service and select Restart. The SAS IRM server restarts in about 3-4 minutes.
  7. Create a new job flow instance with the job flow definition just created.
    1. Log in to SAS IRM.
    2. In the SAS IRM home page (the instance list view), click the New Instance icon.
    3. Enter the following information/settings in the New Instance window:
      1. Instance – demo_jobflow 1
      2. Base Date – March 31, 2019
      3. Configuration – SAMPLE_36_CONFIGURATION
      4. Category – demo
      5. Flow – demo_jobflow
      6. Description -- <Leave Blank>
      7. Entity – ENTITY BE
      8. Entity Role – Solo
      9. Federated Area – Sample.3.6
    4. Click Create on the top right of the screen to create the job flow instance.
  8. View the demo_jobflow 1 instance.
    1. In the instance list view window, locate the demo_jobflow 1 instance.
    2. Double-click to open the job flow instance.
       


      sd_9_SASIRM_bl02_bpmn04.png

       

    3. When inspecting the diagram view of the job flow instance, we notice some labeling issues. Only the input table for the first task/job has a meaningful label.
  9. Add labels to objects in a job flow definition.
    1. In addition to the three unlabeled tables identified above we also want to label the BPMN file so that it displays the label and not the actual BPMN file name in the Flow field:


      sd_10SASIRM_bl02_flow01.png

       

       

    2. The labels used in a job flow definition can be added/modified in the jobflows.properties file located at
      \<SAS-config-directory>\Lev1\AppData\SASIRM\fa.sample.3.6\config\messages.
    3. Open the jobflows.properties file using Notepad++.
    4. In the section titled JOB FLOW FILE NAMES, add the following line at the end:
      demo_jobflow.bpmn=Demo Job Flow for Discounting Cash Flows
      
    5. In the section titled TABLE LABELS, add the following lines at the end:
      #Labels for TMP
      TMP.cashflows=Copy of fact table for cash flows
      TMP.dis_cashflows=Discounted fact table for cash flows
    6. Save the jobflows.properties file.
    7. Restart SAS IRM by restarting the SAS[Config-Lev1]SASServer8_1 – WebAppServer service.
    8. After the SAS IRM server restarts, open the demo_jobflow 1 instance again. We notice that the three unlabeled tables from earlier are now labeled.


      sd_11_SASIRM_bl02_bpmn05.png

       

       

       

    9. In addition, the New Instance window now displays the label assigned for the BPMN and not the actual BPMN file name.


      sd_12_SASIRM_bl02_flow02.png
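
To recap the registration entry added to job_flow_definitions.csv in step 5, the sketch below splits that line into its four comma-separated fields with the SCAN function. The variable names (category, flow, entity_role, config_set) and the WORK table name are hypothetical labels chosen here for readability; they are not column names defined by SAS IRM.

```sas
data work.flow_def;
  length line $ 200 category flow entity_role config_set $ 64;
  /* The entry appended to job_flow_definitions.csv in this post */
  line = "demo,demo_jobflow,both,sample_36_configuration";
  category    = scan(line, 1, ",");  /* Category field in the New Instance window */
  flow        = scan(line, 2, ",");  /* Flow field, i.e. the BPMN file name without extension */
  entity_role = scan(line, 3, ",");  /* solo, group, or both */
  config_set  = scan(line, 4, ",");  /* configuration set under which the flow is listed */
  put category= flow= entity_role= config_set=;
run;
```

Reading the entry this way makes it easier to spot a missing or misplaced field before restarting the server.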

       

       

       

Additional Information

For more in-depth training on SAS IRM, refer to the course SAS Infrastructure for Risk Management Overview.

Comments

Hi @SoumitraDas ,

Thanks for the guide; it is really easy to read and understand.

I need help with a task. I have to add an input table to a SAS node, but I don't understand how to add a simple table from a library in /param_in.

Do I need to change the .bpmn file? What is the best and simplest option?

Thanks

 

Hi @Balda981, if you plan to use this method (BPMN Generator), it is a matter of replicating the entries for an input data object (as shown in the relevant screenshot above), using the three-part nomenclature (LIBRARY.DATASET-NAME.DATASET-EXTENSION) in the BPMN Generator file, to modify the resulting BPMN file. Otherwise, the input data will not show up in the diagrammatic representation of the modified job flow.

 

You will also need to create/modify the SAS code that is working behind the relevant task/node in the job-flow to specify your data, say for example, in a SAS DATA step so that when the code executes it will pick up your data file as the input data.

 

What is important to bear in mind is that whenever we use a library in SAS IRM (in this case, the library located at /param_in), it must first be specified in the LIBNAMES.TXT file (located in the CONFIG folder of the federated area you are working on) for SAS IRM to recognize it. In the example above, there was no need to do this because the STAGING library was already defined in the LIBNAMES.TXT file, while TMP is a SAS system-defined library pointing to the TMP folder in the operating system.
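
One way to confirm that a library is visible to the SAS session that runs a node is to test the libref before using it. The macro below is a minimal sketch; check_libref is a hypothetical name, not a SAS IRM macro, and it relies only on the Base SAS LIBREF and PATHNAME functions.

```sas
%macro check_libref(lib);
  /* LIBREF returns 0 when the libref is assigned, nonzero otherwise */
  %if %sysfunc(libref(&lib)) = 0 %then
    %put NOTE: Libref &lib is assigned to %sysfunc(pathname(&lib)).;
  %else
    %put WARNING: Libref &lib is not assigned.;
%mend check_libref;

%check_libref(STAGING)
%check_libref(TMP)
```

Outside of a SAS IRM job flow run, STAGING and TMP are typically not assigned, so a warning is the expected result there. Calling the macro from within node code shows what the node actually sees.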

 

Please note, as I already pointed out in the writeup, the BPMN generator is not the recommended method for creating job-flows in SAS IRM - the SAS scripting client is the recommended method. Thanks.

 

Hi @SoumitraDas ,

Unfortunately, I still can't insert a new table at the desired node.
I have an already developed SAS IRM solution, with the job flows already integrated and arranged, and I would like to add and map a new library to them in order to obtain a new table to use within the code.
As you suggested, I tried adding the library to libnames.txt (RRM_LAN=%la/%bd/%et/airb) and then adding the param_in declaration to the node ( \param [in] %in_gestionale= rrm_lan.gestionale.sas7bdat : test). But when I run it, the engine fails with: ERROR: Libref RRM_LAN is not assigned.
I need to understand how to make that table available within the node and, above all, how this can be done in an existing solution without creating new tasks. I read in the guides that Doxygen may be needed, but how is it used, and does it work?

Hi @Balda981,

 

The error message is indicating that the RRM_LAN library is not assigned. This is most likely a problem with the path "%la/%bd/%et/airb" defined for the libref. One problem could be related to the "%et" parameter which resolves to the entity selected when creating the job flow. For using "%et", the name of the entity folder must be the lowercase value of the ENTITY_ID. Usually, the entity folders contain multiple base date folders as each entity could be run for multiple time periods. But, in your case this seems to be the other way around - your entity folder is defined within the base date folder (%la/%bd/%et/airb).
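
A quick way to test this hypothesis is to expand the path by hand and check whether the folder exists. The sketch below uses hypothetical values for the landing area root, the base date folder, and the entity ID; how SAS IRM actually resolves %la and %bd should be confirmed in your environment. It only illustrates the rule stated above that the entity folder is the lowercase ENTITY_ID.

```sas
/* All three values are placeholders; replace them with your own */
%let la_root   = /path/to/landing_area;
%let base_dir  = base_date_folder;
%let entity_id = MY_ENTITY;

data work.path_check;
  length path $ 400;
  /* Mirrors the template %la/%bd/%et/airb with a lowercase entity folder */
  path = catx("/", "&la_root", "&base_dir", lowcase("&entity_id"), "airb");
  /* FILEEXIST returns 1 when the folder exists, 0 otherwise */
  folder_exists = fileexist(strip(path));
  put path= folder_exists=;
run;
```

If folder_exists is 0 for the hand-expanded path, the libref cannot be assigned to it, which would be consistent with the error reported above.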

 

There could be several other reasons that could cause a failure to assign the required library like the libref in question may have been defined for a lower federated area and not in a higher federated area, and so on.

 

Your issue seems quite technical and requires more information to come up with a resolution. I recommend reaching out to the SAS technical support team or the SAS implementation team at your site for more specialized assistance.

 

Thanks

