Sl No | Name | Purpose | Location |
1 | Git - Clone Git Repo | Clones a repository from a Git platform (such as GitHub or GitLab) to a folder in your filesystem. | |
2 | Git - Delete Local Repo | Deletes the local copy of a Git repository folder in your filesystem. | |
3 | Git - List Local Repo Changes | Lists files with a changed status (added, deleted, modified) inside a local copy of a Git repository folder. | |
Assume you have a set of SAS programs that define a process. These were created by an analyst or a team of analysts, possibly with each contributing parts of the whole. The code may cover a specific area (such as data engineering or transformations) or be end-to-end. Regardless of how the code is structured or what the use case is, you may want the flexibility to run it anywhere, at any time, in a cloud-native environment. Running your code in a cloud-native environment also implies that you want to design your workload so as to minimise costs.
Enter Git-related custom steps. The first question is how to access these steps from within a SAS Viya environment. One recommendation is to follow the instructions to upload a selected custom step to SAS Viya. An alternative is to use the Git integration already available in SAS Studio: clone the SAS Studio Custom Steps GitHub repository and copy the required custom steps into your SAS Content folders. The animated GIF below shows you how.
In this case, the custom steps you need to make available in SAS Content are the ones in the table above (use the folder links provided).
1. Open up a new SAS Studio Flow (New -> Flow)
2. For neater organization, create swim lanes which are logical subprocesses within your main flow. Here are some instructions.
3. Within a swim lane, pull in the "Git - Clone Git Repo" custom step.
4. Provide the following input parameters (the About tab of each custom step contains more detailed instructions):
The animated GIF below shows these steps.
As a result of the steps above, you now have your source code (SAS programs and other related content) available within your SAS Studio environment, ready to run. Choose the execution method most convenient for you: run the programs standalone, pull them into a flow, or include them within other programs.
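For readers who want to see what a clone operation amounts to outside the custom step, here is a minimal Python sketch that shells out to the `git` command line. It is not how the "Git - Clone Git Repo" step is implemented; the function names `build_clone_command` and `clone_repo`, and the optional `branch` parameter, are assumptions made for illustration only.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_clone_command(repo_url: str, target_dir: str,
                        branch: Optional[str] = None) -> List[str]:
    """Assemble the `git clone` command line without running it."""
    command = ["git", "clone"]
    if branch:
        # Check out a specific branch instead of the remote's default.
        command += ["--branch", branch]
    command += [repo_url, target_dir]
    return command


def clone_repo(repo_url: str, target_dir: str,
               branch: Optional[str] = None) -> Path:
    """Clone repo_url into target_dir (hypothetical helper).

    Refuses to clone into a folder that already has content, so an
    older copy of the code is never silently mixed with a new one.
    """
    target = Path(target_dir)
    if target.is_dir() and any(target.iterdir()):
        raise FileExistsError(f"{target} already exists and is not empty")
    # check=True raises CalledProcessError if git reports a failure.
    subprocess.run(build_clone_command(repo_url, str(target), branch), check=True)
    return target
```

Calling `clone_repo(url, folder)` with your own repository URL and a folder path in the filesystem would leave a working copy in that folder, which is the same end state the custom step aims for.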
Here's a suggestion that can make the whole process even more portable: what if, in addition to your source code assets, you also have a SAS Studio Flow which contains your entire source code, linked together as a process and neatly organised within swim lanes, as shown in the GIF below? This Flow already has code placeholders that are linked to a local repository (saved in a predetermined path as dictated by your business process). A successful run of the Git Clone custom step at the top of the repository paves the way for the subsequent code assets to become available and executable!
An advantage of separating code from the runtime environment is that you can address problems caused by stale code. As this article (and other similar articles) points out, applying the wrong process is a cause of analytics projects failing, and up-to-date code becomes even more important when following modern collaborative analytics processes. The code or process you ran today may well have been updated on the central Git repository since. An earlier version of the code sitting on the Viya environment can then easily mislead analysts into thinking that they are running the latest code. While there are different approaches to address this (such as pulling new code), in ephemeral environments one option is to remove the source code once it has served its purpose (i.e. finished executing). You may still want to persist the log in an alternative location so that you can access it as a matter of record. A planned future Git-related custom step will also facilitate "pulling" (updating) existing code repos, so stay tuned.
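The "remove the code once it has run" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of the "Git - Delete Local Repo" step: `delete_local_repo`, its `require_git_dir` safety check, and the `demo` helper are all hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path


def delete_local_repo(repo_dir: str, require_git_dir: bool = True) -> bool:
    """Delete a local repository folder; return True if it was removed."""
    repo = Path(repo_dir)
    if not repo.is_dir():
        return False  # nothing to delete
    # Safety check (an assumption of this sketch): only delete folders
    # that contain a .git entry, to avoid removing an unrelated path.
    if require_git_dir and not (repo / ".git").exists():
        raise ValueError(f"{repo} does not look like a Git repository")
    shutil.rmtree(repo)
    return True


def demo() -> bool:
    """Build a throwaway folder with a .git marker, delete it, and
    report whether it is gone."""
    scratch = Path(tempfile.mkdtemp())
    repo = scratch / "repo"
    (repo / ".git").mkdir(parents=True)
    (repo / "program.sas").write_text("/* placeholder */")
    delete_local_repo(str(repo))
    gone = not repo.exists()
    shutil.rmtree(scratch)  # clean up the scratch area itself
    return gone
```

If you want to keep the log as a matter of record, copy it to its alternative location before calling the delete, since everything under the repository folder is removed.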
Given the dynamism of many analytics programs (especially at early stages of development or production), analysts may sometimes need to change parts of the SAS source code. In such cases, they may want tools that easily capture all changes made to the source code. In SAS Studio's built-in Git integration, this is available through the Git icon: upon selecting a repository, it highlights the files that have been changed.
At the programming level, you might want to trigger further downstream decisions based on the changes made (the addition of a new file, the deletion of a file, or the modification of an existing file). For this purpose, we make use of the "Git - List Local Repo Changes" custom step. It identifies all changes within the local repo and lists them in a table. It goes further and promotes the table to memory, allowing it to be visualised as shown in the picture below. Analysts can use this report to decide which changes they would like to make to the source repo. Again, in the spirit of community-driven development, you are also encouraged to design future custom steps which can push your changes to the central repo.
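To make the idea of a "table of changes" concrete, the sketch below turns the output of `git status --porcelain` into rows of path and status. It is a simplified illustration rather than the custom step's own logic: `parse_porcelain`, `list_repo_changes`, and the label mapping are assumptions, and quoted paths and combined statuses are not handled in full.

```python
import subprocess
from typing import Dict, List

# Map porcelain status letters to plain labels. Untracked files ("??")
# are reported as "added" here, which is a choice made for this sketch.
STATUS_LABELS = {"A": "added", "?": "added", "D": "deleted",
                 "M": "modified", "R": "renamed"}


def parse_porcelain(output: str) -> List[Dict[str, str]]:
    """Convert `git status --porcelain` text into a list of row dicts."""
    rows = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        index_status, worktree_status, path = line[0], line[1], line[3:]
        # Prefer the working-tree column; fall back to the index column.
        code = worktree_status if worktree_status != " " else index_status
        rows.append({"path": path, "status": STATUS_LABELS.get(code, "other")})
    return rows


def list_repo_changes(repo_dir: str) -> List[Dict[str, str]]:
    """Run git in repo_dir and return its pending changes as rows."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    # Do not strip the output: a leading space is part of the status code.
    return parse_porcelain(result.stdout)
```

Downstream logic can then branch on the rows, for example by checking whether any row has the status `deleted` before deciding to rerun a dependent program.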
Using Git-related Custom Steps provides you convenience and portability. Hydrate and execute content within your target Viya environment of choice in an efficient and cost-effective manner. Have fun with these custom steps, feel free to improve upon them, and send an email in case of any questions.
Many thanks to Danny Zimmerman for his help and collaboration on Git integration within SAS Studio. In addition, thanks to the GitHub SAS Studio Custom Steps maintainers community (Wilbram, David, and Mary) for facilitating the process to make these steps available.