
SAS DI Developers: Unite! The new GIT plug-in in Data Integration Studio

Started 04-09-2019, modified 08-28-2020

With DI Studio 4.904, delivered in SAS 9.4M6, there is a new Git plug-in. Numerous customers asked for Git to be added alongside the version control systems already supported, CVS and SVN.

 

I will demonstrate how to work with the DI Studio plug-in through several use cases. I chose to work with GitHub.

Reminder: Git is a revision control system, a tool to manage your source code history. GitHub is a hosting service for Git repositories. They are not the same thing: Git is the tool, GitHub is the service for projects that use Git.

 

Why GitHub? A few GitHub facts:

  • GitHub is the most popular code sharing platform
  • GitHub is a web-based hosting service for version control using Git
  • GitHub has 18 million users and over 37 million projects

How to use DI with Git

 

Use cases:

Watch this video to see a quick demonstration of the use cases:

 

 

You need to complete the prerequisites to be able to work with GitHub and the Git plug-in.

Advantages of using the Git plug-in:

  • Work collaboratively
  • Version your SAS DI jobs
  • Share versions with other developers
  • Restore previous versions, restore jobs deleted by mistake

 

Create a new DI job

Scenario: Start a new DI job, then make it available to other developers.

 

Open DI, create a new job, and drop a user-written node on the canvas. Go to the Code tab and write the following:

 

proc setinit; /*version A*/; quit;

The code is not as important as the functionality of the plug-in.

 

1.1-Create-DI-Job-user-written-code-1024x651.png

 


 

Press OK. Save the job.

 

In Folders, find your job, right-click it and select Archive as a SAS Package:

 

1.2-Create-DI-Job-archive-as-SAS-package-178x300 (1).png

 

Enter the name of the archive and the description (comment):

 

1.3-Create-DI-Job-archive-as-SAS-package-and-push-to-git-1024x527.png  

 

Press OK. The first time you work with the plug-in, you will be asked for your GitHub credentials:

 

1.4-Create-DI-Job-push-to-github-credentials-1024x688.png

 

When the operation has completed successfully, go to your GitHub repository:

 

1.5-Create-DI-Job-check-github-repository-1024x542.png

 

The job List_Job was pushed to the GitHub repository. GitHub plays the role of a central archive repository. The job is archived as a package (.spk) before being sent to GitHub with the comment you entered.
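For readers who know the Git command line, one archive action can be pictured as an add, a commit and a push. The sketch below is only an analogy, not the plug-in's documented behavior. The function name, the file name List_Job.spk and the idea that the package is committed together with Archives.xml are assumptions made for illustration. The code only builds the commands and does not run them.

```python
from typing import List


def archive_commands(package_file: str, comment: str) -> List[List[str]]:
    """Return git commands roughly comparable to one 'Archive as a SAS Package' action.

    Assumption: the package and Archives.xml are staged, committed with the
    archive comment, and pushed to master. The plug-in's real internals may differ.
    """
    return [
        ["git", "add", package_file, "Archives.xml"],
        ["git", "commit", "-m", comment],
        ["git", "push", "origin", "master"],
    ]


# Hypothetical package file name; the plug-in chooses the actual name.
for command in archive_commands("List_Job.spk", "version A"):
    print(" ".join(command))
```

The comment you type in the archive dialog plays the role of the commit message, which is why it appears next to the file on GitHub.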

 

Modify a DI job

Scenario 1: You need to modify a job already published by you on GitHub.

 

In DI, open the job and change the code inside the job.

 

2.-Modify-the-DI-job.png

 

Save the Job.

 

In Folders, find your job, right-click it and select Archive as a SAS Package.

 

1.2-Create-DI-Job-archive-as-SAS-package-178x300 (1).png

 

Give the archive the same name, List_Job. This time, enter 'version B' as the comment:

 

2.1-Modify-the-DI-job-archive-with-git.png

 

Check on GitHub: the job in the repository has its description updated to ‘version B’. The archive contains the new code. Archives.xml tracks the changes.

 

2.2-Modify-the-DI-job-check-github-repository-1024x550.png

 

 

Scenario 2: You need to modify a job already published on GitHub by another developer. You need to:

  1. Delete the job from the SAS folders
  2. Initialize the repository
  3. Restore the last job version

 

Compare versions

Scenario: you need to compare the changes in two versions of the same DI job. E.g.: version A vs version B.

 

In Folders, find your job, right-click it and select Archived SAS Packages:

 

3.-Compare-versions-archived-SAS-packages-170x300.png

 

3.1-Compare-versions-archived-SAS-packages-1024x491.png  

 

Press Compare To… and scroll to examine the changes:

 

3.2-Compare-versions-1024x359.png

 

The DI job is stored as an XML file. When you compare, the XML of the selected version (A in the example) is checked against the XML of the current job in your SAS folders.
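To get a feel for what such a comparison produces, here is a minimal sketch that diffs two XML texts with Python's standard difflib module. It is not how DI Studio implements Compare To. The function name and the two XML fragments are made up for illustration and do not reflect the real job XML layout.

```python
import difflib
from typing import List


def compare_job_xml(archived_xml: str, current_xml: str) -> List[str]:
    """Return unified-diff lines between an archived job XML and the current one."""
    return list(
        difflib.unified_diff(
            archived_xml.splitlines(),
            current_xml.splitlines(),
            fromfile="archived",
            tofile="current",
            lineterm="",
        )
    )


# Made-up XML fragments standing in for two versions of the same job.
version_a = "<Job>\n  <Code>proc setinit; /*version A*/</Code>\n</Job>"
version_b = "<Job>\n  <Code>proc setinit; /*version B*/</Code>\n</Job>"

for line in compare_job_xml(version_a, version_b):
    print(line)
```

Lines starting with a minus sign exist only in the archived version, and lines starting with a plus sign exist only in the current job.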

 

Restore a DI job

Scenario: you need to decommission a job and its archived versions, or you need to refresh the jobs from the GitHub repository.

 

If you delete the job from the folders, this will not be automatically reflected in the GitHub repository:

 

4.-Delete-a-DI-job.png

 

First, you can restore the deleted job from the Archived SAS Packages. You can choose a version if you have several:

 

4.2-Delete-a-DI-job-restore-1024x687.png

 

The job will be restored in its original folder:

 

4.2-Delete-a-DI-job-restored-job-from-github.png

 

Delete Permanently

 

If you delete the Archived packages one by one:

 

4.3-Delete-a-DI-job-from-github.png

 

4.4-Delete-a-DI-job-from-github.png

 

the job will also be removed from the GitHub repository:

 

4.6-Delete-DI-Job-from-github-repository-1024x512.png

 

Prerequisites for using the Git plug-in

 

You need to complete the following steps before trying the use cases listed above:

 

  1. Have the right version: SAS 9.4M6 with DI Studio 4.904 installed
  2. Remove the CVS and SVN plug-ins from the SAS Home folders
  3. Have a GitHub user name and password, or create a GitHub account
  4. Install Git for Windows on the same machine as SAS DI
  5. Have access to the folders where the GitHub repository will be cloned
  6. Configure the Git plug-in in DI

Remove the CVS and SVN plug-ins

 

This is a one-off configuration item. When you open DI, there will be three options for a version control system: CVS, SVN and Git Plug-in. Only one version control system can be enabled at a time. To enable the Git plug-in, you must remove the plug-in folders for CVS and SVN. Go to the DI Studio install location, in SAS Home folder:

 

...\SASHome\SASDataIntegrationStudio\4.7\plugins

5.2.1-remove-cvs-and-svn.png

 

Find the two folders below. I recommend you move them to another folder before removal (in the event you would like to re-enable CVS or SVN later):

 

sas.dbuilder.versioncontrol.cvsplug-in
sas.dbuilder.versioncontrol.svnplug-in

You also must go to

…\SASHome\SASVersionedJarRepository\eclipse\plugins\

5.2.2-remove-cvs-and-svn.png

 

and move the folders prefixed by:

 

sas.dbuilder.versioncontrol.cvsplug-in
sas.dbuilder.versioncontrol.svnplug-in

 

As a result, the folder should contain only the version control folders prefixed with:

 

 sas.dbuilder.versioncontrol.git….

 

5.2.3-git-folders.png

 

In some cases, you will not be allowed to move the folders. You then need to stop the SAS services in the right order (e.g., run stopALLServices.bat), move the folders, and restart the services in the opposite order (e.g., run startAllServices.bat).
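If you prefer to script the move rather than drag folders by hand, a minimal sketch could look like the following. The function name and the backup location are assumptions. The prefixes are copied as spelled in this article, so check them against the actual folder names on your machine before running anything against a real SASHome. The demonstration at the end works in a temporary folder only.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List

# Prefixes as spelled in this article; verify them on your own installation.
PREFIXES = (
    "sas.dbuilder.versioncontrol.cvsplug-in",
    "sas.dbuilder.versioncontrol.svnplug-in",
)


def move_plugin_folders(plugins_dir: Path, backup_dir: Path, prefixes: Iterable[str]) -> List[str]:
    """Move every folder in plugins_dir whose name starts with one of the prefixes."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    wanted = tuple(prefixes)
    moved = []
    for entry in sorted(plugins_dir.iterdir()):
        if entry.is_dir() and entry.name.startswith(wanted):
            shutil.move(str(entry), str(backup_dir / entry.name))
            moved.append(entry.name)
    return moved


# Demonstration in a throwaway folder with made-up folder names.
with tempfile.TemporaryDirectory() as tmp:
    demo_plugins = Path(tmp) / "plugins"
    for name in (PREFIXES[0] + "_demo", "sas.dbuilder.versioncontrol.git_demo"):
        (demo_plugins / name).mkdir(parents=True)
    print(move_plugin_folders(demo_plugins, Path(tmp) / "backup", PREFIXES))
```

Moving the folders instead of deleting them keeps the option to re-enable CVS or SVN later by moving them back.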

 

Create a GitHub Account

Go to GitHub and create an account (or connect to your GitHub repository). In this example, I set up a repository called ‘SASprograms’.

 

5.3.1-create-github-repository-1024x354.png

 

Customize the Files in the GitHub Repository

 

Create Archives.xml

 

5.3.2-create-github-repository-archives.xml_.png

 

Add the following line to the file (the plug-in needs it to work correctly):

 

<?xml version="1.0" encoding="UTF-8"?><ROOT></ROOT>

5.3.3-create-github-repository-archives.xml_.png
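The file can also be created in a local clone instead of the GitHub web editor. The sketch below writes exactly the content shown above and then checks that it parses as XML with an empty ROOT element. The function names are illustrative, and the demonstration writes into a temporary folder rather than a real repository.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Content taken verbatim from the article.
ARCHIVES_XML = '<?xml version="1.0" encoding="UTF-8"?><ROOT></ROOT>'


def write_archives_xml(repo_dir: Path) -> Path:
    """Write the initial Archives.xml into repo_dir and return its path."""
    target = repo_dir / "Archives.xml"
    target.write_text(ARCHIVES_XML, encoding="utf-8")
    return target


def is_empty_root(path: Path) -> bool:
    """Check that the file parses as XML and holds an empty <ROOT> element."""
    root = ET.parse(str(path)).getroot()
    return root.tag == "ROOT" and len(root) == 0


# Demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    created = write_archives_xml(Path(tmp))
    print(created.name, is_empty_root(created))
```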

 

Create Readme.MD - optional

 

5.3.4-create-github-repository-readme.md_.png

 

Checkpoint

 

Your GitHub repository should look like this:

 

5.3.5-create-github-repository-checkpoint.png

 

Install Git for Windows

Install the free Git for Windows distribution. You need git.exe for the Git plug-in to work in DI, and Git Bash for the SSH configuration.

 

5.4-install-git-for-windows-1024x687.png

 

Have access to folders where the GitHub repository will be cloned

e.g.: C:\temp will be used in this example

 

Configure the Git plug-in in Data Integration Studio

You can configure the communication over SSH or HTTPS. We will focus on HTTPS only:

 

Initialize the Repository

Open SAS Data Integration Studio. Go to Tools > Options

 

5.6-DI-initialize-git-repository.png

 

Fill in the parameters:

 

Git Repository URL: the URL of the remote repository you want to connect to. You can copy and paste the value from the remote Git repository, as shown in the example below:

 

5.6-DI-initialize-git-repository-url.png

 

Git Repository URL, e.g.: https://github.com/bteleuca/SASprograms.git

 

Location of the Git Executable:  This is the path to the git executable file. The default install location: C:\Program Files\Git\bin\git.exe

 

Location of git repository: a folder of your choice e.g.: C:\TEMP

 

Connection Type: HTTPS

 

Press the Initialize Repository button to pull remote files from git.
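The four parameters can be summarized in a small sketch that also performs a few sanity checks. The class and method names are made up for illustration, and the checks are my own, not rules enforced by DI Studio. The clone command at the end is an assumption about what Initialize Repository is comparable to on the command line. The code only builds that command and does not run it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GitPluginSettings:
    """The values entered under Tools > Options, as described above."""
    repository_url: str
    git_executable: str
    local_repository: str
    connection_type: str = "HTTPS"

    def problems(self) -> List[str]:
        """Return a list of obvious inconsistencies (empty when none are found)."""
        issues = []
        if self.connection_type == "HTTPS" and not self.repository_url.startswith("https://"):
            issues.append("HTTPS connection type needs an https:// repository URL")
        if not self.git_executable.lower().endswith("git.exe"):
            issues.append("git executable should point to git.exe")
        if not self.local_repository:
            issues.append("local repository folder is empty")
        return issues

    def clone_command(self) -> List[str]:
        """Assumed command-line counterpart of the Initialize Repository button."""
        return [self.git_executable, "clone", self.repository_url, self.local_repository]


# Values taken from the example in this article.
settings = GitPluginSettings(
    repository_url="https://github.com/bteleuca/SASprograms.git",
    git_executable=r"C:\Program Files\Git\bin\git.exe",
    local_repository=r"C:\TEMP",
)
print(settings.problems())
print(settings.clone_command())
```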

 

The first time, you will be prompted for a user ID and password. These are your GitHub user name and password. They are stored securely on the local machine by the Windows Credential Manager.

 

5.6.3-DI-initialize-git-repository.png

 

 

The local repository is initialized with the files from GitHub.

 

5.6.4-DI-initialize-git-repository-check-300x96.png

 

Hey, you did it! You configured the Git plug-in!

 

Credits and References

Thank you Eric Waldbauer and Chuck Bass

References: SAS DI 4.904 user guide - search for git

Explore further

DevOps applied to SAS 9

Git commands

DevOps applied to Viya 3.5

Conclusion

You now hopefully have a sound understanding of how to use the SAS DI 4.904 Git plug-in and how to manage your SAS packages on GitHub, the most popular code sharing platform.

 

The use cases highlight how SAS DI developers can work collaboratively, version jobs, share work with other developers via GitHub, restore previous versions and delete them.

 

If you have tried working with the plug-in, please share your experiences via the comment box.

Comments

Hi ,

 

Thanks for the nice tutorial on how to use GIT with SAS DI.

 

In the tutorial you are only versioning a single SAS DI job. How would you implement versioning of a solution with 100+ individual SAS DI jobs. The SAS DI jobs are placed in a folder structure reflecting the execution order of the jobs for loading a data warehouse solution.  E.g.:

 

 

Project Folder

  Folder A

     Job 1

     Job 2

  Folder B

     Job 3

     Job 4

...

 

 

There are functional dependencies between the jobs in Folder A and Folder B - e.g. Job 3 uses the result from Job 1

Would you still archive each individual job or would you archive on a per folder basis - and in this case: on which level in the folder structure?

 

Best regards

Martin

Hi,

You have the following options:

archive the top Project folder (folders and jobs below)

archive Folder A, Folder B separately (and jobs below)

It will work.

 

99.Multi_folder_job_archive.PNG

 

You also have the option to perform all the operations programmatically, using the SAS GITFN functions. See Danny Zimmerman's SAS Global Forum Paper: SAS Functions to Drive Source Control with GIT

Hi, 

 

Does this provide any functionality beyond what is already available when archiving to SVN or CVS? Of course I prefer GIT, but I was looking for some additional functionality, in particular when comparing SAS files.

 

regards. 

 

Richard

 

 

 

Hi, 

 

Thanks for your answer BogdanT. - Keep up the good work 🙂

 

Two additional questions - we are trying to build a process around GIT using the branching structures described in  https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow 

 

  1. Is there any way we can control which branch SAS DI uses? Right now all check-ins end up in our main branch, rather than in the feature branch the developer is working on.
  2. Merging of spk packages in GIT - can it actually be done?

 

Best regards

Martin

Is it possible to use this plugin with previous releases of SAS DI (4.902 or 4.903)?

From 4.904 only.

Hi @mh_martin-hanse,

 

Were you able to find an alternative for DI to work with the GIT branches?

 

Can you share?

 

thanks

Hello @mh_martin-hanse and @kassialua1 ,

 

Currently the git plugin for DI Studio does not support archiving SAS Packages to git branches.  Thus, it also does not support merges.

 

If this is a feature that should be considered for the future, consider contacting @VincentRejany  at SAS.

 

Thank you,

Eric

 

Thank you for the clarification Eric.


HI!

 

Is it possible to use GitLab instead of GitHub?

Is the GUI for GIT within SAS DI only available when using SAS servers in Windows environments? I have tried following the example and it doesn't seem to work for remote Linux servers hosted on AWS connecting to Bitbucket.

Hi gra_in_aus, only tested the GIT GUI for DI 4.904 on Windows.

Let me make sure I understand correctly: you tried to replicate this on a DI installation with SAS on Linux?

 

Hi nettless1,

I think it should be possible to use gitlab. The GitLab project URL should be accessible from the SAS Machine where you have DI installed.

GitLab still uses Git technology and commands behind the scenes.

Hi BogdanT ,

 

I wanted to know whether the GIT integration in the SAS Windows client can interact with a SAS Server hosted on Linux. Looking at the GUI, you cannot select a SAS Server or navigate around a Linux environment from SAS DI.

 

 

 

How to safely get rid of this setup?

 

Hi,

I would like to ask if maybe now there is a possibility to configure Git plug-in not to commit on branch master automatically but to redirect to a specific branch?

And a second question: does the Git plug-in work with Azure DevOps Repos instead of GitLab / GitHub?

 

Thank you in advance for your reply,

Greg

Hello @GregCh ,

Currently the git plug-in in DI Studio can only commit to "master" branch.  This is a known issue and a future maintenance release will allow users to select a commit branch of their choosing.  

 

The git plug-in does work successfully with Azure DevOps repos.

 

Eric

 

Hello,

 

Is it possible to run "Archive as a SAS Package" in batch mode, like export/import of .spk packages?

 

/Greg

Hi,

 

Managed to successfully use the GIT interface using SAS DI :).  Are there any plans to improve the GIT Functionality in SAS DI (similar to SAS Studio or EG)?  As it's not a useful interface at the moment when compared to using a GIT repo against deployed code from SAS DI jobs.

@gra_in_aus please try to be more specific on what you need. Then put that into a SASWare Ballot post.

SASware Ballot Ideas - SAS Support Communities

That said, SAS is putting its main developing efforts into Viya, so I wouldn't expect any major developments on the 9.4 platform (including DI Studio).

