Git doesn’t need to be hard to understand and use. In this post, I want to walk you through a practical way of using Git with SAS Studio in SAS Viya. SAS Studio provides a user interface to Git that makes Git much easier to work with than the command line. There is no complex syntax to understand and no details to remember: you simply point and click on the Git commands you want to use. The information Git needs will be there for you, so there is no need to type in extra details.
See the screenshot below for a sample view of the Git user interface inside SAS Studio in SAS Viya:
Figure 1: Screenshot of SAS Studio Git user interface
My previous blog post, "Demystifying Git: A Survey of Data and AI Users on SAS" (SAS Support Communities), indicates that 59% of our local user community already use Git at least weekly for their data and AI work. The survey also tells us that 41% use Git less often, so there is potential to get more users to enjoy the benefits of using Git.
As explained in the post "Why Use Git for Data and AI Work in SAS Studio?" (SAS Support Communities), there are good reasons to use a version control system like Git, for example to save the day when you accidentally mess up your colleague’s work. Git helps you move back in time to before the accident happened and get back to a state where everything looked good.
In this post I will show you how to use SAS Studio with GitHub, one of the most popular hosted version control systems. SAS Studio also works well with GitLab, Azure Repos, Bitbucket, and quite a few others, and they work much the same way with SAS Studio as GitHub does. Most of these version control systems can also be hosted on-premises by your own organization in case the hosted options are not available.
To be able to connect SAS Studio with GitHub, there are a few prerequisites that need to be in place:
There are essentially 2 ways of connecting to GitHub from SAS Studio – through SSH or through HTTPS. What to use typically depends on policies in your organization – some may prefer SSH, and some may prefer HTTPS:
Although the two are similar, I find HTTPS with a Personal Access Token more user-friendly.
The process of connecting SAS Studio with GitHub through HTTPS using a Personal Access Token can be illustrated as follows:
Figure 2: Overview of connecting SAS Studio with GitHub
We will go through each of these in more detail, as there are some important details to be aware of. As the Personal Access Token is literally the key to your GitHub account, care needs to be taken both in GitHub when creating the token and in SAS Viya when saving it.
For those interested in connecting to GitHub using SSH – there will be a separate blog about that approach.
So let us go through each of the steps above in more detail.
First let us log in to GitHub and create our Personal Access Token (PAT).
Figure 3: Details on creating a Personal Access Token from GitHub
The following explains each point from the illustration:
Now that we have our Personal Access Token, we can store it in My Credentials inside SAS Viya Environment Manager.
Figure 4: Details on storing the Personal Access Token from GitHub in My Credentials inside SAS Viya Environment Manager
The following explains each point from the illustration:
We can now use our Personal Access Token by referencing this AuthDomain when defining the Git Profile in SAS Studio.
Figure 5: Details on defining a Git Profile in SAS Studio
The following explains each point from the illustration:
Now that we have gotten ourselves a Git Profile in SAS Studio, we can use that Git Profile to clone Git repositories to our SAS Studio environment.
Figure 6: Details on cloning a Git repository from GitHub to SAS Studio
The following explains each point from the illustration:
Now that we have cloned our repository from GitHub, we can work with the files and folders in that repository inside SAS Studio.
Figure 7: Details on how to develop and save work from a local Git repository in SAS Studio
The following explains each point from the illustration:
Figure 8: Details on how to stage and commit changes to our local Git repository in SAS Studio
The following explains each point from the illustration:
Now that we have committed our changes to our local Git repository in SAS Studio, we can push those changes back to the remote repository in GitHub that we cloned it from.
Figure 9: Details on how to push committed changes from our local Git Repository back to GitHub
The following explains each point from the illustration:
After all this detailed work, it makes sense to reflect on what we did – so let us review the illustration we started with:
Figure 10: Overview of connecting SAS Studio with GitHub
So, with that – are you ready to use Git efficiently with SAS Studio in SAS Viya?
Questions and other feedback are welcome in the comments.
References used in this post:
Demystifying Git: A Survey of Data and AI Users on SAS - SAS Support Communities
Why Use Git for Data and AI Work in SAS Studio? - SAS Support Communities
Prepare for Git Authentication Changes in SAS Studio on SAS Viya
Whenever I think about learning Git, I read up on it and find that I don't understand the words used (and there are many of them), and that getting started is a long and arduous process. For example, you list 4 prerequisites, and I don't know if I have any of those, nor do I even understand what the words in the prerequisites mean. Could you "de-mystify" this even further?
I think I understand the basic benefit of version control of software.
Hi @PaigeMiller ,
Let me try to explain the prerequisites in some more detail, in an attempt to provide some guidance, while acknowledging your experience of an arduous process, which it indeed is. That is mostly because there are 2 systems that we want to work together, and they work somewhat differently and serve different purposes.
SAS Studio does analytics and data management really well, whereas Git does version control really well, and as such has become an industry standard.
In SAS Viya, SAS Studio uses Git for version control, leaning on an industry-standard approach instead of having its own. And there is indeed a "Git" language that is somewhat different from the existing SAS development language.
The key concepts being used to explain the prerequisites are:
With those concepts explained, the prerequisites might make more sense:
Thanks, this helps me understand why it seems so complicated, but it doesn't really make me want to go ahead and perform these tasks.
@PaigeMiller - There's no doubt there is a big learning curve to understanding Git / GitHub, and the concepts are the hardest part to get your head around. In my experience, however, help is at hand! In my organisation there are many software developers using version control, collaboration and software management tools, and I suspect it may be the same for you. So my recommendation would be to seek out the software management experts where you work and learn from them. It is possible that they already have tools, methodologies and even Git repositories ready for you to use, and you just need to be added as a new user. That is how it works for us anyway. In other words, in most organisations you are not on your own when it comes to software / code versioning and management.
At the moment my team has mixed feelings about using Git for flows (.flw) compared with .sas files. Since .flw files are JSON files that are not pretty-printed by default, it is very difficult, if not impossible, to compare different versions of a flow in Git. One of the ideas we have is to make a package of the pretty-printed JSON and the SAS code generated from it.
I'm interested to hear whether anyone else recognizes this.
It is indeed a cool idea - and if you want to tap into the diff functionality of git - you may need to do something like that.
You could automate this in a CI/CD process, for instance. There are REST APIs available to generate the SAS code, and I would be interested in a pretty printer for a JSON-based flow file myself. Have you made one?
I would think that the flow file might be challenging to get a meaningful diff out of - as there is quite a bit of content in it. But the generated SAS code could potentially be useful.
At any rate, I would recommend having a proper pull request with a conversation on the changes being made - and some kind of visual comparison of the different versions being contemplated in the pull request. Which in theory could be done by those making the change, showing what changes they did. More manual than a diff - but maybe healthy nonetheless to get such a conversation going? Why would they make a change to the same flow for instance? There are ways of modularizing a flow using subflow that could potentially alleviate the reason for such a conflict.
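The pretty-printing idea discussed in this thread could be sketched along the following lines. This is only a minimal sketch in Python, not something provided by SAS. It assumes that a .flw file is plain JSON, as described above, and the names `normalize_flow_json` and `write_pretty_copy` are hypothetical. It re-serializes the JSON with fixed indentation and sorted keys, so that two versions of a flow become line-oriented text that Git can diff, and it writes the result to a separate copy instead of touching the flow file itself.

```python
import json
from pathlib import Path


def normalize_flow_json(text: str) -> str:
    """Return JSON text re-serialized with stable, diff-friendly formatting."""
    data = json.loads(text)
    # indent=2 puts each element on its own line; sort_keys=True keeps the key
    # order stable, so a mere reordering of keys does not show up as a change.
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def write_pretty_copy(flow_path: Path) -> Path:
    """Write a pretty-printed copy next to the flow file and return its path.

    The original file is left untouched (hypothetical naming convention:
    'my_flow.flw' gets a sibling file 'my_flow.flw.pretty.json').
    """
    pretty_path = flow_path.with_name(flow_path.name + ".pretty.json")
    original_text = flow_path.read_text(encoding="utf-8")
    pretty_path.write_text(normalize_flow_json(original_text), encoding="utf-8")
    return pretty_path
```

A CI/CD step or a local script could call `write_pretty_copy(Path("my_flow.flw"))` (a hypothetical file name) before committing, so that the pretty-printed copy is versioned alongside the flow. Whether sorted keys give a meaningful diff for a real flow file would need to be checked against actual flows.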
@SASKiwi great suggestion, thanks! I will look into this.
Thanks for this article.
One fundamental issue is that SAS Studio on SAS 9.4 will not work with local repositories. It requires that every Git repository has some sort of external repository associated with it. The Git functions in SAS make the remote repository optional when creating a repository, and SAS Studio on SAS Viya honors local repositories without issue.
People using local repositories in SAS 9.4 are forced to sit on the sidelines for this.
@epower - We use SAS 9.4 and Enterprise Guide with local repositories. These are managed by MS Visual Studio, totally independent of SAS. Using MS VS Solution Explorer you can double-click on a program to open it in EG. Saving a program in EG updates it in the local repository. Alternatively if your local repo is available via a Windows share, opening the program into EG via Windows Explorer is also possible.
Yes, there are many ways of interacting with Git. Sadly, SAS Studio with 9.4 does not offer one for local repositories.
Great article @larsarne! I think you highlighted everything really well.
@epower I apologize for the limitations of SASStudio 3.x and 9.4 with regard to local-only repositories. The Git functionality was quite new at that time, and we were unable to get all of the functionality we wanted into SASStudio 3.x. That being said, there is a way to get you off the sidelines and use local-only repositories in SASStudio 3.x and SAS 9.4.
You can utilize the SAS git functions to accomplish this.
Open a SAS program tab within SASStudio and enter the following code (you'll need to edit the parameters):
data _null_;
  rc = GIT_INIT_REPO("<pathOfFolderYouWantToBeALocalRepo>", "<initialBranchName>", "<remoteURLParameter>");
run;
The last parameter is optional but in this case, you'll need to enter something (can be anything/random text) in order to be able to register the repository with SASStudio. The limitation you're running into is the inability to use the UI to initialize a directory as a local repository and the ability to open a local only repository (a repo without a remote set).
Once you have initialized the local repository, you can navigate to the git pane within SASStudio, click on the dropdown and select "Open a Local Repository". Find your local repo within the file system and select OK.
Now you can use your local repo without having an actual remote repository.
If you decide in the future to set a remote repository URL for your local repo, there is another function for that:
data _null_;
  rc = GIT_SET_URL("<localRepoPath>", "<remoteRepoURL>");
run;
Let me know if you have any questions. Always happy to help.
Danny