Demystifying Git – Using SAS Studio in SAS Viya with GitHub through HTTPS
Git doesn’t need to be hard to understand and use. In this post, I want to walk you through a practical way of using Git with SAS Studio in SAS Viya. SAS Studio provides a user interface to Git that is much easier to work with than the Git command line: there is no complex syntax to understand and no details to remember, because you simply point and click on the Git commands you want to use. The information Git needs is already filled in for you, so there is no need to type in extra details.
See the screenshot below for a sample view of the Git user interface inside SAS Studio in SAS Viya:
Figure 1: Screenshot of SAS Studio Git user interface
My previous blog, Demystifying Git: A Survey of Data and AI Users on SAS - SAS Support Communities, indicates that 59% of our local user community already use Git at least weekly for their data and AI work. The survey also tells us that 41% use Git less often, which suggests there is potential to get more users to enjoy the benefits of using Git.
As explained in the blog Why Use Git for Data and AI Work in SAS Studio? - SAS Support Communities, there are good reasons to use a version control system like Git, for instance to save the day when you accidentally mess up your colleague’s work. Git will help you move back in time to before the accident happened and get you back to a state where everything looked good.
In this post I will show you how to use SAS Studio with GitHub, one of the most popular hosted version control systems. SAS Studio also works well with GitLab, Azure Repos, Bitbucket and quite a few others, and they work with SAS Studio in much the same way as GitHub does. Most of these version control systems can also be hosted on-prem by your own organization in case the hosted options are not available.
Prerequisites
To be able to connect SAS Studio with GitHub, there are a few prerequisites that need to be in place:
- A managed version control system, for instance GitHub. Other systems such as GitLab, Bitbucket, Azure DevOps and quite a few others also work. Most of these also provide on-prem installations if there is a need to set these up inside your own network.
- SAS Studio enabled for Git integration: HTTPS integration if you want to connect through HTTPS, SSH integration if you want to connect through SSH.
- An available authDomain where you can store a personal access token to authorize access to Git on your behalf. This was a change in SAS Viya from the 2024.12 release; see Prepare for Git Authentication Changes in SAS Studio on SAS Viya. If you don’t have an authDomain or can’t find one, ask your SAS Administrator to create one for you.
- Your personal Home folder on a file system mounted to SAS Studio. Although it is possible to use a commonly available file system, Git is less able to track your changes properly when you work in a shared folder.
Connecting through SSH or through HTTPS with Personal Access Token
There are essentially 2 ways of connecting to GitHub from SAS Studio – through SSH or through HTTPS. What to use typically depends on policies in your organization – some may prefer SSH, and some may prefer HTTPS:
- Using SSH involves a process of creating private and public keys and making those keys available to SAS Studio.
- Using HTTPS involves a process of creating a Personal Access Token in GitHub and making that token available to SAS Studio.
Although similar, I find HTTPS using a Personal Access Token more user friendly.
The process of connecting SAS Studio with GitHub through HTTPS using a Personal Access Token can be illustrated as follows:
Figure 2: Overview of connecting SAS Studio with GitHub
- Generate Personal Access Token (PAT) in GitHub
- Enter SAS Environment Manager and store this PAT in My Credentials in an available authDomain
- Enter SAS Studio and Define a Git profile with HTTPS based on this authDomain (that holds the PAT) – use the userid and email that you have on your GitHub profile
- In SAS Studio, clone a git repository from GitHub based on this Git Profile
- Now you can work as usual in SAS Studio on the files you have access to in this cloned Git repository. As this is a clone, you are now working on the local Git repository. Through the Git user interface in SAS Studio, you can stage the changed files once they are ready to be committed to your local repository. Commit with a meaningful commit message when you are ready to do so.
- To get these committed changes back to GitHub, you need to Push the changes – and the Git user interface in SAS Studio helps you do that.
We will go through each of these in more detail, as there are some important details to be aware of. As the Personal Access Token is literally the key to your GitHub, care needs to be taken both when creating the token in GitHub and when saving the token in SAS Viya.
For those interested in connecting to GitHub using SSH – there will be a separate blog about that approach.
So let us go through each of the steps above in more detail.
First let us log in to GitHub and create our Personal Access Token (PAT).
1. First time Git connection – creating the Personal Access Token in GitHub
Figure 3: Details on creating a Personal Access Token from GitHub
The following explains each point from the illustration:
- Enter profile by clicking on your user image on the upper right corner
- Enter Settings
- Enter Developer settings in the menu that comes up
- Enter Personal access tokens (classic)
- Choose generate new token (classic)
- Specify name, expiration (duration) and repo scope – nothing else
- Hit generate new token at bottom of page
- Copy token
- Configure SSO if you need to authorize access to your organization’s repositories
- Hit the “Authorize” button to finalize authorizing access to the organization you want to authorize SSO into
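Before storing the token, it can be reassuring to confirm that it works and carries only the repo scope. The following is a minimal Python sketch and not part of the SAS Studio workflow. It assumes the public GitHub REST endpoint https://api.github.com/user and the X-OAuth-Scopes response header that GitHub returns for classic tokens; the function names are my own and only serve as illustration.

```python
import urllib.request
from typing import Optional, Set

GITHUB_API_USER_URL = "https://api.github.com/user"


def build_token_check_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking GitHub who owns the token."""
    return urllib.request.Request(
        GITHUB_API_USER_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def parse_scopes(header_value: Optional[str]) -> Set[str]:
    """Split a comma-separated X-OAuth-Scopes header into a set of scope names."""
    if not header_value:
        return set()
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def has_only_repo_scope(header_value: Optional[str]) -> bool:
    """True when the token carries the repo scope and nothing else."""
    return parse_scopes(header_value) == {"repo"}


# Sending the request needs network access and a real token, so it is left
# commented out here:
# with urllib.request.urlopen(build_token_check_request(my_token)) as response:
#     print(has_only_repo_scope(response.headers.get("X-OAuth-Scopes")))
```

If the check reports more scopes than repo, consider regenerating the token with a narrower selection before saving it anywhere.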
Now that we have our Personal Access Token, we can store it in My Credentials inside SAS Viya Environment Manager.
2. Saving the Personal Access Token in Environment Manager
Figure 4: Details on storing the Personal Access Token from GitHub in My Credentials inside SAS Viya Environment Manager
The following explains each point from the illustration:
- Enter My Credentials in Environment Manager
- Hit new credentials (asterisk)
- Pick an authDomain that is available. Ask your SAS admin if you are unsure which to use or if you don’t find any. DO NOT use defaultAuth, as it is used internally in SAS Viya.
- Enter a userId for this credential. Use your full name; you might as well use the one you have in GitHub. This is case sensitive, so type it in exactly as you want it: when you set up the Git Profile in SAS Studio in the next step, SAS Studio will match against this userId.
- Enter the personal access token (PAT) you created in GitHub as the password.
We can now use our Personal Access Token by referencing this AuthDomain when defining the Git Profile in SAS Studio.
3. Define Git Profile in SAS Studio
Figure 5: Details on defining a Git Profile in SAS Studio
The following explains each point from the illustration:
- Under options menu, choose Manage Git Connections
- In the manage Git Connections window, choose Profiles
- Hit ‘+’ to add a git profile
- In the add profile window, choose the https option
- Give the profile a meaningful name – for instance My GitHub
- As the commit author name, enter the userId you stored in My Credentials in the previous step, exactly as written. This is case sensitive and will be matched against the entry with that userId
- As the commit author email, enter the email from your GitHub profile
- Select the authDomain where you stored the GitHub Personal Access Token in the previous step
- Hit ‘OK’ to save the new Git profile in SAS Studio
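To illustrate why the exact spelling of the commit author name matters, here is a toy Python model of a case-sensitive lookup. It is purely illustrative: the dictionary, the authDomain name GitHubAuth and the function find_credential are hypothetical, and the sketch says nothing about how SAS Viya implements My Credentials internally.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for My Credentials: (authDomain, userId) -> stored secret.
CredentialStore = Dict[Tuple[str, str], str]


def find_credential(
    store: CredentialStore, auth_domain: str, commit_author_name: str
) -> Optional[str]:
    """Exact, case-sensitive lookup: 'Jane Doe' and 'jane doe' are different keys."""
    return store.get((auth_domain, commit_author_name))


# Example data, invented for this sketch.
store = {("GitHubAuth", "Jane Doe"): "example-token"}

print(find_credential(store, "GitHubAuth", "Jane Doe"))   # finds the stored entry
print(find_credential(store, "GitHubAuth", "jane doe"))   # finds nothing
```

The practical takeaway is to copy the userId from My Credentials rather than retype it when filling in the Git profile.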
Now that we have gotten ourselves a Git Profile in SAS Studio, we can use that Git Profile to clone Git repositories to our SAS Studio environment.
4. Clone Git repository from GitHub
Figure 6: Details on cloning a Git repository from GitHub to SAS Studio
The following explains each point from the illustration:
- In the ‘Manage Git Connections’ window, choose the Repositories page
- Hit the ‘+’ button to Clone a repository
- Enter GitHub in another browser tab, pick the repository you want to clone and hit the Code button (the green one)
- Select the HTTPS option
- Copy the link to the GitHub repository by hitting this button
- Paste the link to the GitHub repository you want to clone in this field
- In your home folder, create the folder you want to clone the repository to – you will not be able to clone into SAS Content as this is not a file system
- Name the folder for your local Git repository (using the same name as the repository you are cloning from is good practice), then select the folder you created so that it is filled in on the clone repository screen
- Select the Git profile you created in the previous step
- Hit ‘Clone’ and the GitHub repository will be cloned into the folder you specified
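Two details in the steps above are easy to get wrong: copying the SSH link instead of the HTTPS link, and naming the local folder differently from the repository. As a small sketch, the Python helpers below check that a link is an HTTPS clone URL and derive a folder name from it. The function names and the example URL are made up for illustration; SAS Studio does not require any of this.

```python
import posixpath
from urllib.parse import urlparse


def is_https_clone_url(url: str) -> bool:
    """True for links like https://host/owner/repo.git, False for SSH-style links."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)


def repo_folder_name(url: str) -> str:
    """Suggest a local folder name equal to the repository name in the URL."""
    path = urlparse(url).path.rstrip("/")
    name = posixpath.basename(path)
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


example_url = "https://github.com/example-org/example-repo.git"  # hypothetical
print(is_https_clone_url(example_url))
print(repo_folder_name(example_url))
```

An SSH-style link such as git@github.com:example-org/example-repo.git fails the first check, which is a hint that the wrong option was selected under the Code button.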
Now that we have cloned our repository from GitHub, we can work with the files and folders in that repository inside SAS Studio.
5. Use Git from SAS Studio – develop and save work
Figure 7: Details on how to develop and save work from a local Git repository in SAS Studio
The following explains each point from the illustration:
- The Git icon at the left of the folder indicates this is a Git managed folder in SAS Studio
- This button provides a list of Git repositories to work with inside SAS Studio
- You can now work in SAS Studio on files in this Git managed repository, save that work and Git will detect what files have changed
- This tab holds a user-friendly interface to work with Git – let us look at that in our next section.
5.4 Use Git in SAS Studio to stage and commit to our local repository
Figure 8: Details on how to stage and commit changes to our local Git repository in SAS Studio
The following explains each point from the illustration:
- The user interface to Git in SAS Studio provides a list of files that Git has detected changes on locally inside SAS Studio
- This button performs a “stage” on all changed files – you can also stage files individually. Staging means we are telling Git that the changes to these files are ready to be committed to our local repository inside SAS Studio
- Once we are ready to commit these staged changes to our local repository – we can enter a meaningful commit message and hit the commit button.
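Outside SAS Studio, the same distinction between staged and not yet staged files is visible in the output of git status --porcelain, where the first status column describes the staging area and the second describes the working folder. The Python sketch below splits such output into the two groups. It assumes the version 1 porcelain format and ignores special cases such as renames; the function name and the sample file names are hypothetical.

```python
from typing import Dict, List


def split_changes(porcelain_output: str) -> Dict[str, List[str]]:
    """Group paths from `git status --porcelain` into staged and unstaged lists."""
    staged: List[str] = []
    unstaged: List[str] = []
    # Do not strip the text first: a leading space in a line is meaningful.
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # too short to hold two status characters, a space and a path
        index_status, worktree_status, path = line[0], line[1], line[3:]
        if index_status not in (" ", "?"):
            staged.append(path)      # change is recorded in the staging area
        if worktree_status != " ":
            unstaged.append(path)    # change (or untracked file) not yet staged
    return {"staged": staged, "unstaged": unstaged}


sample = " M report.sas\nA  new_macro.sas\n?? notes.txt\nMM both.sas"
print(split_changes(sample))
```

A file can appear in both lists, as both.sas does in the sample: part of its changes were staged, and it was edited again afterwards.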
Now that we have committed our changes to our local Git repository in SAS Studio, we can push those changes back to the remote repository in GitHub that we cloned it from.
6. Push back to our remote repository in GitHub
Figure 9: Details on how to push committed changes from our local Git Repository back to GitHub
The following explains each point from the illustration:
- As these changes are only committed to our local repository inside SAS Studio, we need to hit the push button to push these changes back to the repository we cloned it from.
- Notice the commit message now being visible in our remote GitHub repository which shows that these changes are now available in the remote GitHub repository as well
Summary – what we just did
After all this detailed work, it makes sense to reflect on what we did – so let us review the illustration we started with:
Figure 10: Overview of connecting SAS Studio with GitHub
- First we generated our Personal Access Token (PAT) in GitHub.
- Then we stored this PAT in My Credentials in an available authDomain using SAS Viya Environment Manager.
- Then we defined a Git profile in SAS Studio with HTTPS based on this authDomain (that holds the PAT) – and we used the userid and email that we have on our GitHub profile
- Then we cloned a Git repository from GitHub based on this Git Profile in SAS Studio
- At this point, we are ready to work as usual in SAS Studio on the files we have access to in this cloned Git Repository. As this is cloned – we are now working on the local Git repository. Through the Git user interface in SAS Studio, we can stage the changed files when we are ready to commit these changes to our local repository. Commit with a meaningful commit message when ready to do so.
- To get these committed changes back to GitHub, we need to Push the changes – and the Git user interface in SAS Studio helps us do that.
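For readers curious about what the point-and-click steps correspond to on the Git command line, the Python sketch below lists the equivalent commands for clone, stage, commit and push. It is a rough approximation under my own assumptions, not a description of what SAS Studio runs internally: the function name is hypothetical, the remote is assumed to be called origin, and authentication (which SAS Studio handles through the Git profile and authDomain) is left out entirely.

```python
from typing import List


def git_workflow_commands(
    repo_url: str, folder: str, commit_message: str, remote: str = "origin"
) -> List[List[str]]:
    """Return the Git commands that roughly mirror the workflow in this post."""
    return [
        ["git", "clone", repo_url, folder],                     # clone from GitHub
        ["git", "-C", folder, "add", "--all"],                  # stage all changes
        ["git", "-C", folder, "commit", "-m", commit_message],  # commit locally
        ["git", "-C", folder, "push", remote],                  # push back to GitHub
    ]


commands = git_workflow_commands(
    "https://github.com/example-org/example-repo.git",  # hypothetical repository
    "example-repo",
    "Add monthly report program",
)
for command in commands:
    print(" ".join(command))

# To actually run them, something like this would be needed (requires Git and
# valid credentials, so it is left commented out):
# import subprocess
# for command in commands:
#     subprocess.run(command, check=True)
```

Seeing the four commands side by side also makes it clear why a commit alone does not reach GitHub: only the last command talks to the remote repository.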
So, with that – are you ready to use Git efficiently with SAS Studio in SAS Viya?
Questions and other feedback are welcome in the comments.
References used in this post:
Demystifying Git: A Survey of Data and AI Users on SAS - SAS Support Communities
Why Use Git for Data and AI Work in SAS Studio? - SAS Support Communities
Prepare for Git Authentication Changes in SAS Studio on SAS Viya
Whenever I think about learning Git, I read up on it and find that I don't understand the words used (and there are many of them), and that there is a long and arduous process to get started. So for example, you list 4 prerequisites and I don't know if I have any of those, nor do I even understand what the words in the prerequisites mean. Could you "demystify" this even further?
I think I understand the basic benefit of version control of software.
Hi @PaigeMiller ,
Let me try to explain the prerequisites in some more detail, in an attempt to provide some guidance, while acknowledging your experience of an arduous process... which indeed it is. That is mostly because there are two systems that we want to work together, and they work differently and support different purposes.
SAS Studio does analytics and data management really well, whereas Git does version control really well, and as such has become an industry standard.
In SAS Viya, SAS Studio uses Git for version control in order to lean on an industry standard approach instead of having its own. And there is indeed a "Git" language that is somewhat different from the existing SAS development language.
The key concepts being used to explain the prerequisites are:
- Repository - related content that is kept together in a folder-based file structure
- Version control system - we had these in the past also, and for most practical purposes these have now converged into Git-based systems, as Git has become an industry standard for providing version control of source code
- A vault or a safe place to store credentials - in SAS Viya we call these authDomains; on Azure the similar construct is called a vault
- A file system made available to SAS Studio - so that Git can work with it
With those concepts explained, the prerequisites might make more sense:
- The purpose of a managed version control system like GitHub is to provide a central location to keep repositories. GitHub is the most popular one out there, and it is likely that GitHub contributed to the success of Git, as it came out shortly after. Other systems like GitLab, Bitbucket, Azure DevOps and quite a few others work quite similarly and are also based on Git. Most of these also provide on-prem installations if there is a need to set these up inside your own network.
- SAS Studio in SAS Viya is not always enabled for Git integration, as a setting needs to be turned on inside SAS Viya for Git to be enabled. If it is off, it needs to be turned on. And as there are two ways to connect, either through HTTPS (web) or through SSH (secure shell), SAS Viya also needs to be configured for one or the other, or both. The person configuring these things needs the necessary privileges to do so.
- SAS Viya needs to have credentials stored in a safe place, and calls such storage an authDomain. The personal access token you generate through the process described here needs to be stored in an authDomain, so that SAS Studio can authorize access to Git on your behalf. This was a change in SAS Viya from the 2024.12 release; see Prepare for Git Authentication Changes in SAS Studio on SAS Viya. If you don’t have an authDomain or can’t find one, ask your SAS Administrator to create one for you.
- Git can only work with file systems, so SAS Studio needs to have a file system available for Git to work. And to work well with Git, you want to ensure that what Git is managing on your behalf are files and folders that only you have access to, so that Git can properly track who is making changes to them. That is why I recommend that you use your personal Home folder on a file system mounted to SAS Studio. This may not always be available, as personal folders need to be configured for such use, which is something your admin would typically set up. If a personal folder is not available, you can still use a commonly available file system. That will work, but there is a chance that others might make changes to those files, in which case Git will not be able to track who made the change.
Thanks, this helps me understand why it seems so complicated, but it doesn't really make me want to go ahead and perform these tasks.
@PaigeMiller - There's no doubt there is a big learning curve to understanding Git / GitHub, and the concepts are the hardest part to get your head around. In my experience, however, help is at hand! In my organisation there are many software developers using version control, collaboration and software management tools, and I suspect it may be the same for you. So my recommendation would be to seek out the software management experts where you work and learn from them. It is possible that they already have tools, methodologies and even Git repositories ready for you to use, and you just need to be added as a new user. That is how it works for us anyway. In other words, you are not on your own in most organisations when it comes to software / code versioning and management.
At the moment my team has mixed feelings about using Git for flows (.flw) compared with .sas files. Since .flw files are JSON files that are not pretty-printed by default, it is very difficult, if not impossible, to compare different versions of a flow in Git. One of the ideas we have is to make a package of a pretty-printed JSON and the SAS code generated from it.
I am interested to hear if anyone recognizes this.
It is indeed a cool idea, and if you want to tap into the diff functionality of Git, you may need to do something like that.
You could automate this in a CI/CD process, for instance. There are REST APIs available to generate the SAS code, and I would be interested in a pretty printer for a JSON-based flow file myself. Have you made one?
I would think that the flow file might be challenging to get a meaningful diff out of, as there is quite a bit of content in it. But the generated SAS code could potentially be useful.
At any rate, I would recommend having a proper pull request with a conversation on the changes being made, and some kind of visual comparison of the different versions being contemplated in the pull request. In theory this could be done by those making the change, showing what changes they made. It is more manual than a diff, but maybe healthy nonetheless to get such a conversation going. Why would they make a change to the same flow, for instance? There are ways of modularizing a flow using subflows that could potentially alleviate the reason for such a conflict.
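On the pretty-printing idea: since a flow file is JSON, a generic JSON re-serializer is one possible starting point. The Python sketch below is only that, a sketch. It treats the flow as ordinary JSON, knows nothing about the actual structure of a .flw file, and the function name is hypothetical. Sorting the keys is my own choice to keep the output stable between saves; whether that is appropriate for flow files is an assumption to verify.

```python
import json


def pretty_print_flow(raw_json: str) -> str:
    """Re-serialize JSON with one property per line so that Git diffs are readable."""
    parsed = json.loads(raw_json)
    # indent gives one element per line; sort_keys keeps key order stable.
    return json.dumps(parsed, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


compact = '{"b":1,"a":[1,2]}'  # stand-in for the single-line content of a flow file
print(pretty_print_flow(compact), end="")
```

Running the function on its own output returns the same text, so it can be applied repeatedly (for example before each commit) without producing spurious changes.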
@SASKiwi great suggestion, thanks! I will look into this.