Re: SAS macros in Git

JackHamilton · Posted 03-31-2021 07:28 PM

Hello,

It's possible to read individual files from GitLab, GitHub, and their relatives using URL filerefs. What I don't see is a way to maintain a macro library in Git without keeping a copy of all the macro files in a file directory on the server or workstation.

Is this possible? Am I just overlooking something?

Example that doesn't work:

filename gitmacs url 'github.com/my_sas_macros';

options append=sasautos gitmacs ;

I don't think there's a way to get a file directory from Git through a filename in SAS, which is probably what's needed. But maybe I just don't know what it's called, or maybe it's not available until a later version (we're on 9.4 m5). The solution would have to work in base SAS batch jobs as well as SAS Studio, EG, and so forth.

Tom · Posted 03-31-2021 10:19 PM

I am not much of a GIT or GITHUB user but from my limited understanding the GIT model is that you create a local folder for working with the repository and use GIT tools to keep it in sync with the repository.

If you follow that model then you can just use normal SAS autocall features to locate and use the files. If you want to make sure you are using the most up to date version of the files just call the appropriate GIT command line tool command to refresh your local copy for running your SAS code.

What I used to do to make sure my current SAS session would always go find the most current macro definition instead of using a version that might have been already compiled in this session was to use PROC CATALOG to erase the compiled macros from the catalog where SAS stores them in the WORK library. Make sure to turn on MRECALL option also.

JackHamilton · Posted 04-02-2021 05:53 PM

The typical usage is through a local copy of the repository, as you describe, but it is also possible to edit files directly through the web interface without making a local copy.

I won't try to defend that as a good practice, but if I can tell the users "No, you can't do what you want, it's just not possible in SAS", they might decide that the standard methods will work for them.

ChrisHemedinger · Posted 04-01-2021 08:37 AM

In 9.4 M6 (sorry Jack) we now have support for Git functions so it's possible to clone entire repositories to your local session. This allows you to maintain code in Git using whatever folder structure you like, and then pull it to your session as you need it.

Example:

/* create a temp folder to hold the shared code */
options dlcreatedir;
%let repoPath = %sysfunc(getoption(WORK))/shared-code;
libname repo "&repoPath.";
libname repo clear;
 
/* Fetch latest code from Git */
data _null_;
 rc = gitfn_clone( 
   "https://gitlab.mycompany.com/sas-projects/shared-code/",
   "&repoPath.");
run;
 
options source2;
/* run the code in this session */
%include "&repoPath./bootstrap-macros.sas";

I've got several resources about using Git with SAS, including "How to organize your SAS projects in Git".

If you have SAS University Edition or an account on SAS OnDemand for Academics, you can try these techniques with my Netflix data and SAS code.

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!

JackHamilton · Posted 04-02-2021 03:01 PM

For reasons I can't say I understand, some of the users of this particular set of macros would prefer to store everything in Git when possible, with no materialization in the local file system.

That's not possible right now, and if it were I think it would be slower and less reliable than using the native file system.

But the concept underlying it seems reasonable to me: use Git as an aggregate file system. Microsoft has done something similar with https://github.com/microsoft/VFSForGit , and there's also GitFS, but neither of those solutions is complete. SAS Institute could consider implementing a GitFS client inside base SAS itself, which would make it available and supported everywhere. Syntax could be something like "filename mystuff gitfs 'https://git.example.com/projects/bigproject' authdomain='myauth';" and usage could be like "%include mystuff(pgm.sas);" or "options append=sasautos mystuff;".

ChrisHemedinger · Posted 04-02-2021 04:34 PM

Git relies on interaction with a file system so the idea of macros that you store in Git and use in a "file-less" manner is a bit limiting. However, I think that when it comes to simply consuming/using the macros, we can get pretty close using existing supporting mechanisms.

One approach relies on the design of the repository itself. In his library of useful macros at https://github.com/sasjs/core, @AllanBowe includes a script that concatenates all of the macros together into a single file so that it's simple to download/compile with just two lines:

filename mc url "https://raw.githubusercontent.com/sasjs/core/main/all.sas";
%inc mc;

This requires that as part of a commit process or triggered as a Git action, you run this script so the big file is always current with the latest code. This is a model that lots of Git-hosted code libraries use (Javascript libraries with lots of components are good examples). You maintain the components in their individual files, but you deploy them as a single-file unit.

Or, if you have just folders of programs that you want to leverage as autocall macros, you can use GIT functions to clone the repository to your SAS session and then options append to your SASAUTOS, as you illustrated. This example doesn't do quite that, but it is a working example of pulling code and data from Git and running it directly in your SAS session.

/* These steps COULD be a macro to create a local folder and clone */
/* a repository of SAS macros into it.                             */
/* Then append the local folder path to SASAUTOS                   */

/* Clone to a temp space -- I don't even need to know where */
options dlcreatedir;
%let repoPath = %sysfunc(getoption(WORK))/sas-netflix-git;
libname repo "&repoPath.";
libname repo clear;
 
/* Fetch latest code from GitHub */
data _null_;
 rc = gitfn_clone(
   "https://github.com/cjdinger/sas-netflix-git/",
   "&repoPath.");
 put rc=;
run;
 
options source2;
/* run the code in this session */
%let _SASPROGRAMFILE=&repoPath./code/import_netflix_activity.sas;
%include "&repoPath./code/import_netflix_activity.sas";
%include "&repoPath./code/find_duplicates.sas";
%include "&repoPath./code/netflix_report.sas";

If you're using a SAS environment to maintain this macro library and you want to leverage built-in Git integration, then I think systems like GitFS abstract too much away from you. You want that Git goodness of clone, branch, commit, etc. to be a more native Git-like experience. Git is new to lots of SAS programmers, but it's becoming an essential skill for anyone who collaborates on code.

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!

JackHamilton · Posted 04-02-2021 06:08 PM

The idea is that the usual practices would be used for creating and editing code, so branching, commiting, and other Git features would still be available to people writing macros. The gitfs interface would be used only by consumers of the code.

We have over 250 macros in 8 directories that have accumulated over the years. It would good to have an excuse to clean them up.

My own use of git is mostly with "solo" code - I'm the only user and author. Even in that environment, I've found git to be very helpful, just because of the change tracking. And the knowledge that if I mess something up I get easily get back an earlier version.

JackHamilton · Posted 04-02-2021 08:22 PM

If we ever get a chance to clean up our decades-old macro library, that would be a viable solution. We might have an opportunity to do that during the 9.4m7 installation.

AllanBowe · Posted 04-02-2021 05:11 PM

Hi Jack - great question!

Deploying macros to a filesystem can indeed be problematic for a number of reasons:

* it's another thing to deploy/consider when pushing code

* it's hard to be sure which macros are used where

* changing a single macro can screw up a bunch of applications

In building Data Controller and other SAS Apps we have addressed this scenario head on. Not only that, we have open-sourced our framework for others to use, and it works the same on SAS Viya, SAS 9 with metadata, and traditional SAS Base. There is NO need to store macros on the filesystem.

The idea is this - you have a LOCAL git repository, containing all your SAS Macros, Includes, Jobs and Services. Just like any other language, you check out your feature branch (locally, in your preferred IDE) and write your code. When you _execute_ or test that code, you compile it, then run it in a SAS runtime.

Hang on - I hear you say - compile? What's that?

In our framework, 'compiling' means "concatenating all the macros and includes that relate to a particular job (or web service) so that they are all self contained in a single file". The file can then be 'built' (prepared for deploy) and finally, deployed. It would be deployed as a Stored Process in SAS 9, or a Job in Viya.

Hang on (again) I hear you say. How does the compiler know which macros / includes relate to my particular job (or web service) ?

For that we insist that the developer declares their dependencies (macros & includes) in the program header, using doxygen format. This has the additional benefit that we can auto-generate HTML documentation for our SAS project.

Slow down! How does the compiler know where my jobs, services, macros and includes are located?

For that, we have a source of truth - we call it the sasjsconfig.json file. It describes the entire project, containing LOCAL file locations, jobInit / jobTerm programs, macro variables and - critically - the targets.

What is a target?

The target is the place to which you deploy. So if you're working on your feature branch, and you'd like to compile your SAS code, create a build pack and deploy it - but you don't want it to affect any other of the 1000 users on your SAS box - you create a dedicated target (lets call it "dave") and it will represent a few critical things:

* The URL of the SAS server to which you will deploy

* The server type (SAS9 or SASVIYA)

* The folder (metadata folder in SAS 9, SAS Drive folder on Viya)

Then you can simply run one command to compile, build & deploy your LOCAL sas code into the REMOTE sas server:

sasjs cbd -t dave

Your entire project remains in GIT, you can use your favourite IDE (we recommend VS Code, check out the SASjs extension in the marketplace), and you can test your jobs & services completely in isolation from other developers, merging once you're happy with the results and pushing to a shared dev / test / prod location. Completely driven by git merges and suchlike.

Here's a short overview:

If you'd like to try it out, visit this URL:

https://gitpod.io/#github.com/sasjs/template_jobs

This will open up a container with sasjs already installed, and some sample jobs. You can run `sasjs cb` to compile & build the jobs (see sasjsbuild folder). You can run `sasjs doc` to generate the html documentation (you can see the data lineage there too). You can also run `sasjs lint` to see which SAS programs are breaking the quality rules.

You can also do this locally to by installing sasjs and running `sasjs create MYPROJECT -t jobs`.

Will stop there! In summary, sasjs enables GIT workflows by letting SAS developers work LOCALLY, compiling code into fully self-contained programs, and deploying / executing them in whichever REMOTE sas runtime you would like to use, WITHOUT storing any macros, programs, or other sas code on the target filesystem.

For Viya this requires a client / secret, for SAS 9 it requires SSH access (but we're working on an update that will use the SAS 9 REST API).

/Allan
MacroCore library for app developers
Data Workflows, Data Contracts, Data Lineage, Drag & drop excel EUCs to SAS 9 & Viya - Data Controller
DevOps and AppDev on SAS 9 / Viya / Base SAS - SASjs

Patrick · Posted 04-03-2021 03:50 AM

Do I understand this right that this also means the Git server becomes a vital part of a SAS production environment and therefore must comply with all the SLAs and failovers for such an environment?

AllanBowe · Posted 04-03-2021 07:46 AM

Not at all - GIT does not need to be connected to SAS at all. Rather it's something that each developer connects to, in order to build their SAS projects.

Once code is linted, documented, compiled, built, deployed and unit-tested - then it is promoted (each job / service being fully self contained, either in metadata or a Viya job) through the usual deployment process into production.

If you choose to make that part of your git / dev ops automated workflow (eg push to production upon merge to a main or master branch) that is up to you.

I use git exclusively from within my local VS Code instance and push my finished jobs to SAS when they are ready (using `sasjs deploy`). Happy to jump on a call and explain more.

/Allan
MacroCore library for app developers
Data Workflows, Data Contracts, Data Lineage, Drag & drop excel EUCs to SAS 9 & Viya - Data Controller
DevOps and AppDev on SAS 9 / Viya / Base SAS - SASjs

Xeonzinc · Posted 01-26-2022 12:33 PM

Hi,

A bit late to the party, but I have built a system for this at my organisation (built in SAS code), so it is possible to do what you are describing, but was quite a bit of complex work. We have a local Gitlab installation which serves as the single storage repository for a wide range of macros across a number of projects. Users may want to utilise any of these macros in various other projects, so the code ecosystem is very interlinked.

These are defined / loaded within a users session using a custom %include type statement in a format like such:

%gl_includeCode(GL_PROJECT=project_name, GL_VERSION=Latest);

or

%gl_includeCode(GL_PROJECT=project_name, GL_VERSION=1.0, GL_FILE=/important_macros/macro_x.sas);

And then macros defined in that project can be called/ utilised as normal by the end user, without ever having to create a local copy. The systems allows for fully nested Gitlab references, and also has traceability tools to users can track what is going ok.

The underlying code is fairly complex, but relies on the Gitlab API to get a list of all SAS files present, and then individual filename statements for every piece of code to include them. It would definitely be nice if there was a more automated way to do this within SAS.

This solutions solves many of the issues we had with a similar structure using standard %includes based on the SAS file system and actually allows for a lot of novel controls which would otherwise be impossible.

Every %GL_includeCode call can be logged so we have full visibility of all macro users for a project / file.
The use of a custom include function also means we can build in automated nested include tracking / presentation to end users, which would otherwise not be possible.
Normal Git versioning is fully accessible, so users can point to a specific version of a repository and never be concerned their referenced macro will change (but will receive log notes with suggestions to move to newer versions)
All the collaborative benefits /documentation of GitLab can be leveraged in a single place, by having code live on there as it's sole home.
Any code enhancements are forced to be added in the single shared repository, rather than getting lost in specific local repositories.

Quentin · Posted 01-26-2022 01:16 PM

@Xeonzinc, this sounds like the makings of a good user group paper!

JackHamilton · Posted 01-26-2022 01:21 PM

It does sound like a good paper, especially if the source code can be made public.

yabwon · Posted 09-05-2023 09:27 AM

Interesting solution, it looks/sounds a bit like the SAS Packages Framework I've been working since 2019, details are here: https://github.com/yabwon/SAS_PACKAGES

and the SAS Packages Repository (SASPAC): https://github.com/orgs/SASPAC/repositories

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation