TLDR: which features of a minimal SAS licence could help, or should be supported by, a general SAS packaging framework?
First, apologies if this is not the appropriate board for this question. Please feel free to direct me elsewhere.
To my knowledge, SAS lacks a packaging system, at least in the sense that other languages and systems use the term. This would include a common framework for:
The nearest thing I can find is SAS/IML packages, although that system is rather basic. I have found the packaging system in R to be an excellent method not only for authoring and distributing utilities, but also entire analysis projects. I have the same aspirations for a packaging system in SAS.
I would intend to write such a system entirely in base SAS, to ensure its portability and independence from particular licensing purchases. So my question is:
Which features of base SAS are available to all users with a minimal licence, and are relevant to this endeavour?
I count:
I'm aiming for a modular design, with a minimal set of general features implemented in a minimal subset of SAS, which provide the basis for more component-specific features.
I have been prototyping such a packaging system but it's definitely time to step back and seek advice about design and architecture before going much further.
The cheaper you go, the harder you make your employees work. Limiting the software tools available to an employee drives turnover, since it limits their advancement through exposure. I followed that route for 20 years, and now the company is finally understanding the need to give employees access to the tools that they can demonstrate improve productivity in their area of work.
I guess you are coming from the point of view of wanting to build your own application in SAS and then be able to package a bundle of executables that can be marketed in its own right. This isn't the first time this topic has come up. There have been various attempts over the years particularly during the period that SAS/AF (SAS's first application menuing system) was popular back in the 1990s.
As you can probably gather, none of these attempts went very far, and the bottom line is that SAS's architecture simply doesn't lend itself to what you are proposing. For example, every single SAS procedure your application uses would have to be included in its entirety, along with the architecture to support it.
In my experience, most SAS practitioners use SAS's multi-tier server-based architecture. Exclusively PC-based installations are no longer as popular as they once were, with the obvious exception of the education industry. You can't "package" server-based solutions, and it would be difficult if not impossible to package SAS on PCs.
I would have thought that providing your application as Software as a Service might be worth considering. By hooking into a cloud-based SAS installation you could rent out any applications you built on it, with the appropriate licensing agreement from SAS of course.
I'm imagining you might be coming from an ecosystem like Perl and CPAN, or Python and pip. SAS is not like other languages and systems, where developers have very fine-grained control of exactly which features are imported into their program. SAS Foundation is very much an everything-and-the-kitchen-sink affair. The simplest way to find out what you get with it is to look in the Base SAS section of the procedures guide here.
What is the problem you are actually trying to solve? Are you attempting to package the software for a desktop system? If you are, the simplest thing is to deploy everything that is licensed. Yes, a SAS installation is big, but hard disk space is now at a point where a deployment of SAS takes less than $1 worth of hard disk space, so trying to optimize what you do and don't deploy is not worth spending time on.
Thanks all for your responses.
Perhaps if I describe our specific problems which I'm trying to solve, my needs will be clearer.
The problem is that at work we currently do a lot of copy-paste programming. That is, when performing an analysis, we often remember or are informed "oh, someone did something like that a few (months/years) ago, check out this code". Or multiple people, performing very similar tasks in parallel or series (e.g. after handover), re-write the same formats and data pipelines over and over again, because there's not a mature culture of code re-use. General-purpose utilities would be very beneficial due to the routine nature of much of our analyses, but are almost completely absent. It's very inefficient. We could benefit immensely from libraries of common formats, functions, macros, concordance datasets, and so on, but nobody has made the effort to develop this. Partly this is because SAS lacks a formal packaging system which would provide a helpful framework for developing and using such a library.
That's the utilities side of things. But fitting analysis projects themselves into a package framework also has its advantages. Currently we have no standards around project structure, so each analyst's codebase is very idiosyncratic, which makes things harder to find and understand. The basic logic of the analysis scripts is hard to follow because everything is defined in-line and ad hoc, rather than having macros defined in a macros/ directory, formats defined in a formats/ directory, and so on. We also often have problems with irreproducible environmental configurations, since people also do this manually. I've been working on a configuration routine which runs automatically on package loading, which would go some way to resolve this problem.
The only reason I asked about what features are included in a minimal SAS license is that I wished such a packaging system to be usable by as many people as possible. In fact at work we have rather an inclusive licensing arrangement including SAS/IML and many others.
Why develop a new tool? Why not use Git? It's not well incorporated into SAS at the moment, but support is increasing.
Your requirements are not unique to SAS, and Git has been used for similar things in many projects (Linux being perhaps the best known example). There's no reason your package has to be written in SAS.
But it doesn't sound like your fundamental problem is a lack of software. You can do most of what you want on a regular shared file system (or ftp server). Your organization is lacking a commitment to doing it, and new software won't fix that, it will only make the process easier.
You can already set up shared macro and format libraries under central control. You can create folder structure templates. I think shared functions are harder (when I last looked, which was a few years ago, it was obvious that SAS hadn't thought through how to use shared functions, but that might be different now).
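For reference, a minimal sketch of one way a shared function can be set up, using PROC FCMP and the CMPLIB= system option. The path /shared/sas/functions, the libref SHARED, the dataset FUNCS, the package UTILS and the function c_to_f are all hypothetical names chosen for illustration; whether this covers what "shared functions" needs to mean in a given shop is a separate question.

```sas
/* Hypothetical shared location; adjust to your own file system. */
libname shared "/shared/sas/functions";

/* Compile a function into the dataset SHARED.FUNCS, package UTILS. */
proc fcmp outlib=shared.funcs.utils;
  function c_to_f(c);
    return (c * 9 / 5 + 32);
  endsub;
run;

/* Tell SAS where to look for compiled functions. */
options cmplib=shared.funcs;

/* Any later DATA step in the session can now call the function. */
data _null_;
  f = c_to_f(100);
  put f=;
run;
```

In this arrangement only the maintainers would run the PROC FCMP step; everyone else would need just the LIBNAME and OPTIONS CMPLIB= statements.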
I echo @JackHamilton's post. You can go a long way with packaging SAS simply by using a version control tool of your choice. You set up separate folders for each of your SAS applications, then add sub-folders for SAS programs, another for SAS macros, maybe another for application metadata, and so on.
If your applications share macros, then set up a separate shared macro folder. Then you need what I call a SAS control program, which sets up your SAS environment. This will define your SAS AUTOCALL macro libraries so your programs will always find them. Ditto for permanent SAS format libraries, and so on.
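A control program along these lines might look roughly like the sketch below. The root path /shared/sas, the file name control.sas and the libref FMTLIB are assumptions made for illustration, not a description of any particular site's setup.

```sas
/* control.sas: hypothetical environment setup, included at the top of each program. */
%let root = /shared/sas;   /* assumed shared root folder */

/* AUTOCALL: search the shared macro folder first, then the SAS-supplied library. */
options mautosource sasautos=("&root/macros" sasautos);

/* Permanent formats: read-only libref containing a FORMATS catalog. */
libname fmtlib "&root/formats" access=readonly;

/* Search the shared catalog before the WORK and LIBRARY catalogs. */
options fmtsearch=(fmtlib work library);
```

Each program would then start with something like %include "/shared/sas/control.sas"; before calling any shared macro. Note that on Unix-like systems an autocall file needs to be named after the macro it defines, in lowercase, with a .sas extension.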
I work in a team of 6 SAS developers and we have no problem sharing and reusing all our common macros or reading our permanent SAS formats. And using version control means anyone can reliably make changes and once thoroughly tested and reviewed, deploy them (automatically) so they will work reliably in a production environment.
"because there's not a mature culture of code re-use"
and that is the core of your problem, and you will not fix that with any software tool, as it is a "people" problem. Believe me (20 years of SAS, and 35+ years of IT experience). Your second paragraph ("each analyst's codebase is very idiosyncratic") is even more telling.
If people want to cooperate, a bunch of shared directories (like /sasprog/formats or /sasprog/macros), some template files and a dictionary of available elements (code pieces, formats, lookup tables, whatever) will be enough. Add a solid set of rules that HAVE to be followed in terms of coding style and documentation, so that it's a pleasure for everyone to read each other's code.
At the core of cooperation is always documentation, documentation, documentation. And did I mention documentation?
A "packaging" system would be counter-productive, at that. Nobody will unpack a package just to see if there's usable code in it.
What you rather need is a code repository and versioning system, like git. But such a tool will only work if people are actually willing to use it.
While I appreciate everyone's advice, I would have appreciated an answer to the question even more. A little faith that I am aware of, and am working on, some of the other issues that were raised would not go astray.
You got an answer. You didn't like the answer. Those are two different things.
You haven't explained why Git and shared libraries, with a commitment by your organization to encourage or require their use, won't solve your problem.
Also, if you are going to set up shared code and libraries it would be best to have a code review procedure in place as well.