
Git Set N Go : SAS Studio Custom Steps for Git integration

Started 02-13-2023
Modified 02-17-2023
Git repositories facilitate collaborative code development.  Analytics practitioners benefit from separating their codebase from the environment, making it easier to manage, govern, execute and recover SAS code.  A centralised repository also aids effective cost management for cloud-native analytics, allowing you to utilise Viya services only for the time you need to use this code.
 
SAS Viya already offers Git capabilities through the Git integration in SAS Studio and through Git functions.  Other SAS Viya applications also integrate with Git, but this article focuses on recent developments that surface Git capabilities through SAS Studio Custom Steps, some of which are available on a community-driven GitHub repository.  SAS Studio Custom Steps are low-code components which can run inside a SAS Studio Flow (or standalone) and execute data engineering or analytics processes.  These custom steps let analytics developers embed Git-related instructions and the necessary configuration within a SAS Content object (Custom Step or Flow) instead of within a specific SAS Viya environment's configuration, which adds a layer of portability to their code.

Benefits:
 
1. Faster Time to Value: You now have flexibility to port your analytics across different Viya environments.
 
2. Lower Costs: When you save everything as code, you are less dependent on your SAS Viya environment and the need to keep it always available.
 
3. Reduced Risk: Separating code from the analytics platform provides resilience and business continuity in the event of failure.  
 
Join me in this walkthrough of some Git-related custom steps recently made available through the SAS Studio Custom Steps GitHub repository.  As some of these steps were developed only recently, the list below should be considered initial; it is set to grow as more community members join in and add assets.  Bookmark this article to capture future updates!
 

Git-related Custom Steps

 

Sl No | Name | Purpose | Location
1 | Git - Clone Git Repo | Clones a repository from a Git platform (like GitHub, GitLab etc.) to a folder in your filesystem. | README, Repo folder
2 | Git - Delete Local Repo | Deletes the local copy of a Git repository folder in your filesystem. | README, Repo folder
3 | Git - List Local Repo Changes | Lists files with changed status (delete, add, modified) inside a local copy of a Git repository folder. | README, Repo folder

 

 

Git-related Custom Steps in Operation: Basic Scenario

Assume you have a set of SAS programs which define a process.  These have been created by an analyst / team of analysts, possibly with each of them contributing to parts of the whole.  The code may cover a specific area (such as data engineering / transformations) or be end-to-end.  Regardless of how this code is structured or the use-case,  you may desire flexibility to run this code anywhere, at any time, in a cloud-native environment.  Running your code in a cloud-native environment also implies you are motivated to design your workload in such a way as to minimise costs.

 

Enter Git-related custom steps.  The first question is how to access these steps from within a SAS Viya environment.  One recommendation is to follow the instructions to upload a selected custom step to SAS Viya.  An alternative is to use the Git integration functionality already available in SAS Studio: clone the SAS Studio Custom Steps GitHub repository and copy the required custom steps to your SAS Content folders.  The animated GIF below shows you how.

 

Custom-Step-Git-Integration.gif

 

In this case, the custom steps you need to make available in SAS Content are the ones in the table above (use folder links which have been provided).

 

Clone your SAS Code Repository

 

1. Open up a new SAS Studio Flow (New -> Flow)

2. For neater organization, create swim lanes which are logical subprocesses within your main flow.  Here are some instructions.

3. Within a swim lane, pull in the "Git - Clone Git Repo" custom step.

4. Provide the following input parameters (the About tab of each custom step contains more detailed instructions):

  • Git repo address which contains your source code
  • Destination folder (ensure this is empty) on your filesystem which will hold your local repository
  • The path to your public and private SSH key files.  Ensure that these key files are already loaded in the filesystem, within a folder with desired access permissions.
  • The SSH username and password, if applicable (otherwise, leave as default)

The animated GIF below shows these steps.  

 

clone-a-git-repo.gif
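
For readers curious about what a clone operation with these inputs involves, here is a minimal Python sketch that drives the git command line directly.  It is not how the custom step is implemented; the function names `build_clone_command` and `clone_repo` and the repository address in the usage note are hypothetical, and the sketch assumes that git and ssh are installed and that a private key file is sufficient for authentication.

```python
import os
import subprocess
from pathlib import Path


def build_clone_command(repo_url, dest_dir, private_key=None):
    """Return (argv, env) for cloning repo_url into dest_dir.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    # Same precondition as the step's input: the destination must be empty.
    if dest.is_dir() and any(dest.iterdir()):
        raise ValueError(f"Destination folder is not empty: {dest}")
    env = dict(os.environ)
    if private_key:
        # GIT_SSH_COMMAND tells git which ssh invocation (and key file) to use.
        env["GIT_SSH_COMMAND"] = f'ssh -i "{private_key}" -o IdentitiesOnly=yes'
    return ["git", "clone", repo_url, str(dest)], env


def clone_repo(repo_url, dest_dir, private_key=None):
    """Run the clone; raises CalledProcessError if git reports a failure."""
    argv, env = build_clone_command(repo_url, dest_dir, private_key)
    subprocess.run(argv, env=env, check=True)
```

A call such as `clone_repo("git@github.com:example/repo.git", "/tmp/my_repo", "/path/to/private_key")` (placeholder values) would then populate the empty destination folder.  Checking for an empty destination before calling git keeps the failure message close to the input that caused it.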

 

 

Execute your SAS Code

 

As a result of the steps above, you now have your source code (SAS programs and other related content) available within your SAS Studio environment, ready to run.  Choose the execution method most convenient for you: run the code standalone, pull it into a flow, or include it within other programs.

 

Here's a suggestion which can make the whole process even more portable: what if, in addition to your source code assets, you also have a SAS Studio Flow which contains your entire source code, linked together in the form of a process and neatly organised within swim lanes, as shown in the image below?  This Flow already has code placeholders which are linked to a local repository (saved in a predetermined path as dictated by your business process).  A successful run of the Git Clone Custom Step at the top of the repository paves the way for subsequent code assets being made available and executable!

 

Screenshot 2023-02-13 at 9.34.52 AM.png

 

Leave No Trace Behind!

 

An advantage of separating code from the runtime environment is that you can address problems caused by stale code.  As this article (and other similar articles) points out, applying the wrong process is a cause of analytics projects failing, and up-to-date code becomes even more important when following modern collaborative analytics processes.  The code or process you ran today may already have been updated on the central Git repository.  However, the presence of an earlier version of the code on the Viya environment can easily mislead analysts into thinking that they are running the latest code.  While there are different approaches to address this (such as pulling new code), in ephemeral environments one option is to remove the source code once it has served its purpose (i.e. finished executing).  You may still like to persist the log in an alternative location so that you can access it as a matter of record.  A planned future Git-related custom step will also facilitate "pulling" (updating) existing code repos, so stay tuned.

 

Screenshot 2023-02-13 at 10.13.20 AM.png
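
The clean-up idea described above (keep the log, then remove the local repository) can be sketched in a few lines of Python.  This is an illustration under stated assumptions and not the code behind the "Git - Delete Local Repo" custom step: the function name `delete_local_repo` and its parameters are hypothetical, and the sketch treats any folder containing a `.git` entry as a local repository.

```python
import shutil
from pathlib import Path


def delete_local_repo(repo_dir, log_file=None, archive_dir=None):
    """Remove a local clone, optionally archiving one log file first.

    Hypothetical helper for illustration only.
    """
    repo = Path(repo_dir).resolve()
    # Guard against deleting an arbitrary folder by mistake.
    if not (repo / ".git").exists():
        raise ValueError(f"{repo} does not look like a Git repository")
    if log_file and archive_dir:
        # Persist the log outside the repository before the folder disappears.
        Path(archive_dir).mkdir(parents=True, exist_ok=True)
        shutil.copy2(log_file, archive_dir)
    shutil.rmtree(repo)
    return not repo.exists()
```

The function returns `True` once the folder is gone, which a calling process could use to confirm that no stale code remains in the environment.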

 

Listing Changes to Local Repo

Acknowledging the dynamism of many analytics programs (especially if they are at early stages of development / production), it may sometimes be necessary for analysts to change parts of the source (SAS) code.  In such a case, they may like to use tools which easily capture all changes that have occurred to source code.  Within the in-built integration to Git in SAS Studio, this is available through the Git icon in SAS Studio upon selecting a repository, which highlights the files to which changes have been made.  

 

At the programming level, you might like to trigger further downstream decisions based on the changes made (the addition of a new file, the deletion of a file, or the modification of an existing file).  For this purpose, we make use of the "Git - List Local Repo Changes" custom step.  This identifies all changes within the local repo and lists them in a table.  It goes further and promotes the table to memory, allowing it to be visualised as shown in the picture below.  Analysts can use this report to decide which changes they would like to make to the source repo.  Again, in the spirit of community-driven development, you are also encouraged to design future custom steps which can push your changes to the central repo.

 

 

git-list-local-repo-changes-general-idea.gif

 

file-change-status-report.png
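
To make the idea of a change listing concrete, here is a minimal Python sketch that turns the output of `git status --porcelain` into rows of a table.  It is not the implementation of the "Git - List Local Repo Changes" custom step; the names `parse_porcelain`, `list_local_changes` and `STATUS_LABELS` are hypothetical, and the parsing is deliberately simplified (for example, a renamed file keeps its `old -> new` text in the path column, and git's quoting of unusual file names is left untouched).

```python
import subprocess

# Labels for the one-letter codes used by the porcelain (v1) status format.
STATUS_LABELS = {
    "M": "modified",
    "A": "added",
    "D": "deleted",
    "R": "renamed",
    "?": "untracked",
}


def parse_porcelain(text):
    """Convert `git status --porcelain` output into a list of row dicts."""
    rows = []
    for line in text.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        index_code, worktree_code, path = line[0], line[1], line[3:]
        # Prefer the working-tree code; fall back to the staged (index) code.
        code = worktree_code if worktree_code != " " else index_code
        rows.append({
            "path": path,
            "status": STATUS_LABELS.get(code, code),
            "staged": index_code not in (" ", "?"),
        })
    return rows


def list_local_changes(repo_dir):
    """Run git in repo_dir and return its pending changes as rows."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

A downstream decision could then be as simple as `any(row["status"] == "deleted" for row in list_local_changes(repo_dir))`, for instance to flag a repository for review before any changes are pushed.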

In Summary

 

Using Git-related Custom Steps provides you convenience and portability.  Hydrate and execute content within your target Viya environment of choice in an efficient and cost-effective manner.  Have fun with these custom steps, feel free to improve upon them, and send an email in case of any questions.

 

Acknowledgements

 

Many thanks to Danny Zimmerman for his help and collaboration on Git integration within SAS Studio.  In addition, thanks to the GitHub SAS Studio Custom Steps maintainers community (Wilbram, David, and Mary) for facilitating the process to make these steps available.

Last update: 02-17-2023 12:03 PM
