Git doesn’t need to be hard to understand and use, and in this post I want to walk you through a practical way of using Git with SAS Studio in SAS Viya. SAS Studio provides a user interface to Git that makes everyday Git work much easier than using the command line. There is no complex syntax to understand and there are no details to remember: you simply point and click on the Git commands you want to use. The information Git needs is filled in for you, so there is no need to type in extra details.
See the screenshot below for a sample view of the Git user interface inside SAS Studio in SAS Viya:
Figure 1: Screenshot of SAS Studio Git user interface
My previous blog, Demystifying Git: A Survey of Data and AI Users on SAS - SAS Support Communities, indicates that 59% of our local user community already use Git at least weekly for their data and AI work. The survey also tells us that 41% use Git less often, which suggests there is potential for more users to enjoy the benefits of Git.
As explained in this blog, Why Use Git for Data and AI Work in SAS Studio? - SAS Support Communities, there are good reasons to use a version control system like Git, for example to save the day when you accidentally mess up your colleague’s work. Git helps you go back in time to before the accident happened, to a state when everything looked good.
In this post I will show you how to use SAS Studio with GitHub, one of the most popular hosted version control systems. SAS Studio also works well with GitLab, Azure Repos, Bitbucket and quite a few others, and they work with SAS Studio in much the same way as GitHub does. Most of these version control systems can also be hosted on-premises by your own organization in case the hosted options are not available.
To be able to connect SAS Studio with GitHub, there are a few prerequisites that need to be in place:
There are essentially 2 ways of connecting to GitHub from SAS Studio – through SSH or through HTTPS. What to use typically depends on policies in your organization – some may prefer SSH, and some may prefer HTTPS:
Although the two approaches are similar, I find HTTPS using a Personal Access Token more user-friendly.
An overview of the process of connecting SAS Studio with GitHub through HTTPS using a Personal Access Token can be illustrated as follows:
Figure 2: Overview of connecting SAS Studio with GitHub
We will go through each of these in more detail, as there are some important details to be aware of. As the Personal Access Token is literally the key to your GitHub, care needs to be taken both in GitHub when creating the token and when saving the token in SAS Viya.
For those interested in connecting to GitHub using SSH – there will be a separate blog about that approach.
So let us go through each of the steps above in more detail.
First let us log in to GitHub and create our Personal Access Token (PAT).
Figure 3: Details on creating a Personal Access Token from GitHub
The following explains each point from the illustration:
Now that we have our Personal Access Token, we can store it in My Credentials inside SAS Viya Environment Manager.
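Before storing the token, it can be worth confirming that GitHub actually accepts it. This is not part of the SAS Studio steps; it is only a minimal sketch of an optional sanity check, assuming Python is available on your own machine. It calls GitHub's REST endpoint for the authenticated user. The environment variable name `GITHUB_PAT` and the helper names are hypothetical and chosen for illustration only.

```python
import json
import os
import urllib.request

# GitHub REST endpoint that returns the user a token belongs to.
GITHUB_USER_ENDPOINT = "https://api.github.com/user"


def build_token_check_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request authenticated with the token."""
    return urllib.request.Request(
        GITHUB_USER_ENDPOINT,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def token_login(token: str) -> str:
    """Send the request and return the GitHub login that owns the token.

    urllib raises urllib.error.HTTPError (status 401) if GitHub rejects the token.
    """
    request = build_token_check_request(token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["login"]


if __name__ == "__main__":
    # GITHUB_PAT is a hypothetical variable name; never hard-code the token itself.
    pat = os.environ.get("GITHUB_PAT")
    if pat:
        print("Token belongs to:", token_login(pat))
    else:
        print("Set GITHUB_PAT to run the check.")
```

Reading the token from an environment variable keeps it out of saved programs and out of your Git history, in the same spirit as storing it in My Credentials rather than in code.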
Figure 4: Details on storing the Personal Access Token from GitHub in My Credentials inside SAS Viya Environment Manager
The following explains each point from the illustration:
We can now use our Personal Access Token by referencing this AuthDomain when defining the Git Profile in SAS Studio.
Figure 5: Details on defining a Git Profile in SAS Studio
The following explains each point from the illustration:
Now that we have gotten ourselves a Git Profile in SAS Studio, we can use that Git Profile to clone Git repositories to our SAS Studio environment.
Figure 6: Details on cloning a Git repository from GitHub to SAS Studio
The following explains each point from the illustration:
Now that we have cloned our repository from GitHub, we can work with the files and folders in that repository inside SAS Studio.
Figure 7: Details on how to develop and save work from a local Git repository in SAS Studio
The following explains each point from the illustration:
Figure 8: Details on how to stage and commit changes to our local Git repository in SAS Studio
The following explains each point from the illustration:
Now that we have committed our changes to our local Git repository in SAS Studio, we can push those changes back to the remote repository in GitHub that we cloned it from.
Figure 9: Details on how to push committed changes from our local Git Repository back to GitHub
The following explains each point from the illustration:
After all this detailed work, it makes sense to reflect on what we did – so let us review the illustration we started with:
Figure 10: Overview of connecting SAS Studio with GitHub
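For readers curious about what the clone, stage, commit and push clicks correspond to on the command line, the sketch below spells out the equivalent Git commands. It is an illustration only, not what SAS Studio literally runs internally. The repository URL, file path, remote name `origin` and branch name `main` are assumptions, and the helper function names are hypothetical.

```python
import shlex
from typing import List, Sequence


def clone_command(repo_url: str, target_dir: str) -> List[str]:
    """Clone: copy the remote repository into a local folder."""
    return ["git", "clone", repo_url, target_dir]


def stage_commit_push_commands(
    paths: Sequence[str],
    message: str,
    remote: str = "origin",  # assumed remote name (Git's default after a clone)
    branch: str = "main",    # assumed branch name; yours may differ
) -> List[List[str]]:
    """Stage, commit and push; these run inside the cloned folder."""
    return [
        ["git", "add", "--", *paths],      # stage the changed files
        ["git", "commit", "-m", message],  # commit to the local repository
        ["git", "push", remote, branch],   # push the commits to the remote
    ]


if __name__ == "__main__":
    # Hypothetical repository and file, used only to print the commands.
    steps = [clone_command("https://github.com/example-org/example-repo.git", "example-repo")]
    steps += stage_commit_push_commands(["programs/report.sas"], "Update report")
    for argv in steps:
        print(shlex.join(argv))
```

The script only prints the commands rather than executing them, so it is safe to run anywhere. Seeing the four commands side by side also shows how much typing the SAS Studio interface saves you.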
So, with that – are you ready to use Git efficiently with SAS Studio in SAS Viya?
Questions and other feedback are welcome in the comments.
References used in this post:
Demystifying Git: A Survey of Data and AI Users on SAS - SAS Support Communities
Why Use Git for Data and AI Work in SAS Studio? - SAS Support Communities
Prepare for Git Authentication Changes in SAS Studio on SAS Viya
Whenever I think about learning Git, I read up on it and find that I don't understand the words used (and there are many of them), and that getting started is a long and arduous process. So for example, you list 4 prerequisites and I don't know if I have any of those, nor do I even understand what the words in the prerequisites mean. Could you "demystify" this even further?
I think I understand the basic benefit of version control of software.
Hi @PaigeMiller ,
Let me try to explain the prerequisites in some more detail, in an attempt to provide some guidance, while acknowledging your experience of an arduous process... which indeed it is. That is mostly because there are 2 systems that we want to work together, and they work differently and serve different purposes.
SAS Studio does analytics and data management really well, whereas Git does version control really well and as such has become an industry standard.
In SAS Viya, SAS Studio uses Git for version control, leaning on an industry-standard approach instead of having its own. And there is indeed a "Git" language that is somewhat different from the existing SAS development language.
The key concepts being used to explain the prerequisites are:
With those concepts explained, the prerequisites might make more sense:
Thanks, this helps me understand why it seems so complicated, but it doesn't really make me want to go ahead and perform these tasks.
@PaigeMiller - There's no doubt there is a big learning curve to understanding Git / GitHub and the concepts are the hardest to get your head around. In my experience however, help is at hand! In my organisation there are many software developers using version control, collaboration and software management tools and I suspect it may be the same for you. So my recommendation would be to seek out the software management experts where you work and learn from them. It is possible that they already have tools, methodologies and even Git repositories ready for you to use and you just need to be added as a new user. That is how it works for us anyway. In other words, you are not on your own in most organisations when it comes to software / code versioning and management.
At the moment my team has mixed feelings about using Git for flows (.flw) compared with .sas files. Since .flw files are JSON files that are not pretty-printed by default, it is very difficult, if not impossible, to compare different versions of a flow in Git. One of the ideas we have is to make a package of a pretty-printed JSON file and the SAS code generated from it.
I'm interested to hear whether anyone else recognizes this.
It is indeed a cool idea, and if you want to tap into the diff functionality of Git, you may need to do something like that.
You could automate this in a CI/CD process, for instance: there are REST APIs available to generate the SAS code. A pretty printer for a JSON-based flow file is something I would be interested in myself. Have you made one?
I would think that the flow file might be challenging to get a meaningful diff out of - as there is quite a bit of content in it. But the generated SAS code could potentially be useful.
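As a starting point for the pretty-printing idea, here is a minimal sketch in Python. It assumes only what was said above, namely that a .flw file is plain JSON. The function names and the `.pretty.json` suffix are hypothetical, and sorting the keys is my own choice to keep diffs stable. The output is meant as a companion copy for diffing, not as a replacement for the flow file that SAS Studio opens.

```python
import json
from pathlib import Path


def pretty_json(text: str) -> str:
    """Return JSON text re-serialized with indentation and sorted keys."""
    data = json.loads(text)
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def write_pretty_copy(flow_path: str) -> Path:
    """Write a pretty-printed sibling file next to the flow and return its path.

    The '.pretty.json' suffix is a hypothetical naming convention.
    """
    source = Path(flow_path)
    target = source.parent / (source.name + ".pretty.json")
    target.write_text(pretty_json(source.read_text(encoding="utf-8")), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Tiny stand-in document; a real flow file holds far more content.
    print(pretty_json('{"b": 1, "a": [1, 2]}'), end="")
```

With one value per line and a stable key order, Git's line-based diff shows only the properties that actually changed between two versions of a flow.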
At any rate, I would recommend having a proper pull request with a conversation on the changes being made, and some kind of visual comparison of the different versions being contemplated in the pull request. In theory, that comparison could be done by those making the change, showing what changes they made. It is more manual than a diff, but maybe healthy nonetheless to get such a conversation going? Why would they make a change to the same flow, for instance? There are ways of modularizing a flow using subflows that could potentially alleviate the reason for such a conflict.
@SASKiwi great suggestion, thanks! I will look into this.