
DevOps Applied to SAS Viya 3.5: Run CAS Programs in Parallel in a Jenkins Pipeline


Learn how to optimize a Jenkins pipeline and add parallel stages to run CAS programs in SAS Viya 3.5. In some cases, you can pick low-hanging fruit simply by reviewing the dependencies between CAS programs and ordering them in sub-stages. In other cases, you might need to analyze your CAS programs and rewrite them in a way that enables Jenkins stage parallelization. Take the following example:

 

From the Jenkins pipeline covered in a previous post, DevOps Applied to SAS Viya 3.5: Run and Test CAS Programs with a Jenkins Pipeline, which loads files in CAS, creates a star schema as a CAS view, tests the star schema and, finally, cleans up:

 

To a Jenkins pipeline with parallel stages:

 


When the pipeline is executed:


[Screenshot: Jenkins Blue Ocean view of the executed pipeline with parallel sub-stages]


What is Needed

  1. CAS programs, written to support parallel stages, pushed to GitLab.
  2. A Jenkins file containing the Jenkins pipeline definition, stored in GitLab.
  3. Jenkins running the pipeline with SAS Viya as an agent.

CAS Programs in GitLab

When you parallelize stages, the parallelization must make sense from a functional perspective.

 

In a first example, you might need to rewrite the SAS code to load one file per SAS program, instead of all the files in one SAS program.

 

In a second example, Jenkins starts the parallel tests only after the CAS star schema view is created.

First example: CAS programs to run in parallel in sub-stages:

 

  • 010_load_catcode.sas
  • 020_load_mailorder.sas
  • 030_load_products.sas
  • 040_load_customers.sas

 

If these files are stored in a CASLib that allows parallel loading, then you get CAS parallel loading inside the parallel Jenkins sub-stages.
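
For illustration only, the source CASLib might be defined along the lines of the sketch below. This is an assumption, not the actual environment setup (which could just as well come from the deployment autoexec): a path-based caslib named DM pointing at the directory the Copy source files stage writes to.

/* Hypothetical caslib definition; the caslib name (DM) and the idea that  */
/* it is path-based are assumptions for illustration.                      */
cas mysess;                                   /* start a CAS session       */

caslib DM datasource=(srctype="path")         /* path-based caslib         */
       path="/gelcontent/demo/DM/data"        /* where the CSV files land  */
       global;                                /* visible to all sessions   */

cas mysess terminate;                         /* end the CAS session       */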

 

Note that these four programs replace 080_load_CAS.sas, which loaded all the files into CAS tables in a single program.
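
As an illustration, one of the per-file load programs, for example 010_load_catcode.sas, could look like the minimal sketch below. The caslib name (DM) and the output table name are assumptions; the actual course programs may differ.

/* Minimal sketch of a per-file load program, e.g. 010_load_catcode.sas.   */
/* Caslib and table names are assumptions for illustration.                */
cas mysess;                                   /* start a CAS session       */

proc casutil incaslib="DM" outcaslib="DM";
   load casdata="catcode.csv"                 /* server-side source file   */
        casout="CATCODE" replace;             /* in-memory target table    */
   promote casdata="CATCODE";                 /* global scope, so later    */
                                              /* stages (new SAS sessions) */
                                              /* can see the table         */
quit;

cas mysess terminate;                         /* end the CAS session       */

Each of the four programs loads its own file, which is what lets Jenkins start them at the same time in separate sub-stages.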

 

Second example: CAS programs to run in parallel in sub-stages:

 

  • 200_functional_test.sas queries the star schema with the simple.summary CAS action.
  • 300_technical_test.sas checks a status code produced by the previous script.
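
A minimal sketch of what the functional test could look like is shown below, assuming the star schema view is promoted in a caslib named DM under the name STAR_SCHEMA (both names are assumptions). The technical test is similar in structure, except that it checks the status produced by the star schema program instead of querying the view.

/* Minimal sketch of a functional test, e.g. 200_functional_test.sas.      */
/* The caslib (DM) and view name (STAR_SCHEMA) are assumptions.            */
cas mysess;                                   /* start a CAS session       */

proc cas;
   /* Summarize the star schema view. If the view is missing or invalid,   */
   /* the action errors out, the SAS return code is non-zero and the       */
   /* Jenkins sub-stage fails.                                             */
   simple.summary /
      table={caslib="DM", name="STAR_SCHEMA"};
run;
quit;

cas mysess terminate;                         /* end the CAS session       */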

Programs that will stay the same:

 

  • 100_create_star_schema.sas creates and queries the star schema as a CAS view.
  • 900_cleanup.sas drops the CAS tables and deletes the source data.
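
For completeness, a clean-up program in the spirit of 900_cleanup.sas might look like the sketch below; the caslib, table and file names are again assumptions.

/* Minimal sketch of a clean-up program, e.g. 900_cleanup.sas.             */
/* Caslib, table and file names are assumptions for illustration.          */
cas mysess;                                   /* start a CAS session       */

proc casutil incaslib="DM";
   droptable casdata="STAR_SCHEMA" quiet;     /* drop the star schema view */
   droptable casdata="CATCODE"     quiet;     /* drop the loaded tables    */
   droptable casdata="MAILORDER"   quiet;
   droptable casdata="PRODUCTS"    quiet;
   droptable casdata="CUSTOMERS"   quiet;
   deletesource casdata="catcode.csv";        /* delete the source files   */
   deletesource casdata="mailorder.csv";
   deletesource casdata="products.csv";
   deletesource casdata="customers.csv";
quit;

cas mysess terminate;                         /* end the CAS session       */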

 

The GitLab project folder might look like this:

 

[Screenshot: the GitLab project folder with the CAS programs]


The Jenkins File

The Jenkins file is stored in GitLab. Its location is stored in the Jenkins pipeline configuration.

 

Jenkins builds the pipeline according to the Jenkins file.

 

A pipeline is composed of stages (and sub-stages). They run on the SAS Viya machine defined by the Jenkins agent label. To see how to define the agent in Jenkins, refer to DevOps Applied to SAS Viya 3.5: Run a SAS Program with a Jenkins Pipeline. The new Jenkins file:

 

  • The Clone GIT on SAS Viya stage prints a simple message.
  • The Copy source files stage copies the source files to the location corresponding to the CASLIB we want to load them into.
  • There is one stage for each group of CAS programs to execute: Load files in CAS, Create Star Schema in CAS, Perform Tests in Parallel, plus the clean-up stages (Cleanup CAS tables and Cleanup files).
  • Load files in CAS has one sub-stage for each file to be loaded. The loads start at the same time, in separate sub-stages. Note the parallel {} block under the stage name.
  • Perform Tests in Parallel contains parallel sub-stages:
    • Functional Test (queries the star schema).
    • Technical Test (queries the star schema creation status).
    • Both depend on the Create Star Schema in CAS stage.
  • A post section prints a message; it is entirely optional.

The new Jenkins file syntax:

pipeline {
    agent { label 'intviya01.race.sas.com' }
    environment {
        userid = 'your-user-here'
    }
    stages {
        stage('Clone GIT on SAS Viya') {
            steps {
                sh '''
                echo "Execution user: " `logname`
                echo "Pipeline user: "${userid}
                '''
            }
        }
        stage('Copy source files') {
            steps {
                sh 'cp -n /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/source_data/* /gelcontent/demo/DM/data/'
            }
        }
        stage('Load files in CAS') {
            parallel {
                stage('Load Catalogue Code Dimension') {
                    steps {
                        echo "Load in parallel 1st file catcode.csv"
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/010_load_catcode.sas -log /tmp/010_load_catcode.log'
                    }
                }
                stage('Load Mailorder Fact') {
                    steps {
                        echo "Load in parallel 2nd file mailorder.csv"
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/020_load_mailorder.sas -log /tmp/020_load_mailorder.log'
                    }
                }
                stage('Load Products Dimension') {
                    steps {
                        echo "Load in parallel 3rd file products.csv"
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/030_load_products.sas -log /tmp/030_load_products.log'
                    }
                }
                stage('Load Customer Dimension') {
                    steps {
                        echo "Load in parallel 4th file customers.csv"
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/040_load_customers.sas -log /tmp/040_load_customers.log'
                    }
                }
            }
        }
        stage('Create Star Schema in CAS') {
            steps {
                sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/100_create_star_schema.sas -log /tmp/100_create_star_schema.log'
            }
        }
        stage('Perform Tests in Parallel') {
            parallel {
                stage('Functional Test') {
                    steps {
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/200_functional_test.sas -log /tmp/200_functional_test.log'
                    }
                }
                stage('Technical Test') {
                    steps {
                        sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/300_technical_test.sas -log /tmp/300_technical_test.log'
                    }
                }
            }
        }
        stage('Cleanup CAS tables') {
            steps {
                sh '/opt/sas/spre/home/SASFoundation/sas -autoexec "/opt/sas/viya/config/etc/workspaceserver/default/autoexec_deployment.sas" /opt/sas/devops/workspace/${userid}-PSGEL250-devops-applied-to-sas-viya-3.5/Data-Management/scripts/900_cleanup.sas -log /tmp/900_cleanup.log'
            }
        }
        stage('Cleanup files') {
            steps {
                sh '''
                    rm -f /gelcontent/demo/DM/data/mailorder.csv
                    rm -f /gelcontent/demo/DM/data/customers.csv
                    rm -f /gelcontent/demo/DM/data/products.csv
                    rm -f /gelcontent/demo/DM/data/catcode.csv
                    rm -f /tmp/010_load_catcode.log
                    rm -f /tmp/020_load_mailorder.log
                    rm -f /tmp/030_load_products.log
                    rm -f /tmp/040_load_customers.log
                    rm -f /tmp/100_create_star_schema.log
                    rm -f /tmp/200_functional_test.log
                    rm -f /tmp/300_technical_test.log
                    rm -f /tmp/900_cleanup.log
                '''
            }
        }
    }
    post { 
        success { 
            echo 'Pipeline optimized and parallelized!'
        }
    }
}

 

The Jenkins Pipeline

The Jenkins pipeline from the previous post is reused as is. The pipeline itself does not change; what changes is the Jenkins file, the Jenkins pipeline definition. To reuse the same pipeline over and over, you might choose to store the Jenkins file in GitLab (or any other version control system).

 

Use Blue Ocean, a Jenkins plug-in, to run and visualize the pipeline. Your pipeline will show previous builds, if any.

 

[Screenshot: the pipeline in Jenkins Blue Ocean, showing previous builds]

 

 

Run the pipeline.

 

[Screenshot: the pipeline run in Jenkins Blue Ocean, with parallel load and test sub-stages]

 

The loads and the tests are performed in parallel.

 

In our example, the pipeline now executes with a time gain but, more importantly, it automates many tasks you previously had to perform manually. Even if the data volumes are very light, you still gain time.

 

The biggest gains come when you save hours of manual work to build, test and validate the results.


Conclusions

We optimized a previously written, serial Jenkins pipeline by adding parallel sub-stages.

 

In some cases, you can pick low-hanging fruit just by reviewing the dependencies between CAS programs and ordering them in sub-stages.

 

In other cases, you might need to analyze the CAS programs and rewrite them in a way that enables Jenkins stage parallelization.

 

More will follow: how to import SAS content, such as SAS Data Studio Plans or SAS Visual Analytics reports.

 


Acknowledgements

@MarkThomas , @RobCollum , @StephenFoerster 



Thank you for taking the time to read this post.
