Solved: Structuring multiple code files in a project

braam · Posted 10-05-2019 04:09 AM

Hi All, I've been struggling with how to structure the multiple code files involved in a project. It's a somewhat generic question, but I would like to get your opinions on it. If you have anything you can recommend (like webpages), that would also be highly appreciated.

Let's say, I have a project where there are three different steps. The last step also includes two sub-steps. So, the structure of this project looks like this:

Step 1.

Step 2.

Step 3.

Step 3.1.

Step 3.2.

In such a situation, I would make a separate code file for each step and sub-step, plus a master code file for the project. The master file pulls in the code file for each step using %include. This way, my master code file looks simple, and it gives me the big picture. But I'm not sure if it's the right way or the most efficient way. I am also aware that some people just write everything in one file. Please feel free to share your insights on it.

hashman · Posted 10-05-2019 04:18 PM

@braam:

This is the way to go.

Myself, I invariably systemize my SAS program structure this very way.

Also, if I may suggest, if you use meaningful names for your step-by-step programs, prefix them with something like:

p00_<pgm name 1>.sas

p01_<pgm name 2>.sas

...

p0n_<pgm name n>.sas

so that they are sorted properly in the program file directory and thus indicate clearly which step follows/precedes which. Usually, my p00_<>.sas is a "driver" program that calls the rest using %include.
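A minimal sketch of what such a driver could look like, assuming a project with the three steps from the question. The folder path and all program names below are hypothetical placeholders, not names from this thread:

```sas
/* p00_driver.sas: hypothetical driver program.                    */
/* The path and the program names are placeholders; adjust them    */
/* to your own project.                                            */
%let pgmdir = /path/to/project/programs;

/* SOURCE2 echoes the included source lines to the log */
%include "&pgmdir/p01_import.sas"   / source2;   /* Step 1   */
%include "&pgmdir/p02_clean.sas"    / source2;   /* Step 2   */
%include "&pgmdir/p03_analyze.sas"  / source2;   /* Step 3.1 */
%include "&pgmdir/p04_report.sas"   / source2;   /* Step 3.2 */
```

Because the numeric prefixes sort in run order, the directory listing and the driver read the same way from top to bottom.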

Kind regards

Paul D.


Kurt_Bremser · Posted 10-05-2019 04:40 AM

There's nothing wrong with the %include approach.

IMO it mostly depends on how you want to keep your code in a version control system. With multiple files for a single process you would need to treat a whole subdirectory as a single entity (and separate your projects accordingly), while with single code files you can keep all .sas files in one directory, as each file represents one process/project.

VDD · Posted 10-05-2019 01:58 PM

I use both approaches.

I find that using %include lets me modify one smaller SAS program when rules change or new ones need to be inserted into the big process.

For me, reading 1,000 lines of SAS code and understanding what is happening in that batch of PROCs and DATA steps is easier than reading 10,000 lines of code that may be structured like COBOL. I mean no harm to COBOL, my friend, but you do have to do a lot of jumping around to understand what is happening when and why.

Reeza · Posted 10-05-2019 02:06 PM

@braam wrote:

In such a situation, I would make a separate code file for each step and sub-step, plus a master code file for the project. The master file pulls in the code file for each step using %include. This way, my master code file looks simple, and it gives me the big picture. But I'm not sure if it's the right way or the most efficient way. I am also aware that some people just write everything in one file. Please feel free to share your insights on it.

15 years of programming and this is still what I do...it works well, keeps your code clean and organized. I like the separation of different functions because it makes it clear where I need to go to change things when I need to and even if I come back to a project years later I can decipher what happened.

Patrick · Posted 10-05-2019 11:19 PM

@braam

That's pretty much the pattern I'm using.

I'd give the program names some leading numbering so they sort in a logical sequence (like 005_<prg name>.sas). I normally number in steps of 5 so I've got some space to add something in between later on.

I'd keep the programs belonging to a project in its own project folder, and I'd then use a macro variable in my master program for the %include statements. Something like:

%let projects=<path to project folder>;

%include "&projects/projectA/005_<prg name>.sas" / source2;

This approach allows you to move a whole project structure somewhere else, and you only need to amend the path in a single place.

If possible, I'd eventually even set the &projects definition in the autoexec.

SASKiwi · Posted 10-06-2019 01:54 AM

I'm an enthusiastic proponent and user of the master code file (or program) approach. As well as calling all of a project's subsidiary programs, the big advantage of the master program is it can set up a consistent SAS environment for the whole project. This can include SAS options, macro variables for folder locations, SAS macro and format libraries, and so on.
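As a rough illustration of that environment setup, here is a sketch of what the top of a master program might contain. Every path, libref, and folder name is a hypothetical placeholder, and the particular options chosen are only examples:

```sas
/* Hypothetical header of a master program.                        */
/* All paths and librefs below are placeholders, not real names.   */
%let root = /path/to/projectA;

/* SAS options that apply to the whole project run */
options mprint source2;

/* Project libraries, defined once for all subsidiary programs */
libname raw  "&root/data/raw" access=readonly;
libname out  "&root/data/out";
libname fmts "&root/formats";

/* Add the project's autocall macro folder to the search path     */
/* and tell SAS where to look for format catalogs                 */
options insert=(sasautos="&root/macros")
        fmtsearch=(fmts work library);
```

The %include statements for the subsidiary programs would follow this header, so each of them can rely on the same librefs and options.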

hashman · Posted 10-06-2019 03:14 AM

@SASKiwi:

Agreed. Just as important, the master/driver program can read and process control files setting the stage for code generation and the separation of data from code, whose purpose is not to contain data (such as, but not limited to, hard coded values or IF-THEN-ELSE wall paper) but to contain logic processing the data, including the data stored in control tables.

As an important side effect, this achieves the freedom of the programming structure from the nightmare of change control, as the data controlling code generation are stored outside of the change control authority. In other words, if the code generator doesn't change (only the generated code does if the control data are edited), there's no need to check it in and out; and if it does change, it is the only program subject to change control.

I'd also say that whenever I work to create systems of this sort, the code generator is never a macro (let alone a ubiquitous "macro wrapper", for some reason considered by many a pinnacle of SAS programming) but straight SAS code that generates flat files for later %includes (formatted and indented exactly as I would write non-generated code), always stored in a specially designated directory, so that it can be reviewed. Needless to say, SOURCE2 is always on, so that if something untoward should occur during processing, the log would point exactly to where the error has occurred (as opposed to the line where the macro wrapper was called, which gives no clue to where the snag actually happened).

I have to stress that by saying the above, I don't intend to start any suum cuique flame here. Macros are good for what they're good for, very useful, and I do use them myself. It's just my private opinion (based on a fair amount of experience) that as a system code generator, the macro language yields to the SAS language by leaps and bounds, not to mention that it's incredibly harder to debug. My rule of thumb is that when a macro gets over-parameterized and/or starts getting inundated with quoting functions, its functionality has reached the point at which the deficiencies of the macro language have become too evident and burdensome.
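To make the idea concrete, here is a small sketch of DATA step code generation driven by a control table. It is only an illustration of the general technique, not the actual system described above: the control table, the data set and variable names, and the output path are all invented for the example, and the control data are read from in-line datalines only to keep the sketch self-contained (in practice they would live outside the program).

```sas
/* Hypothetical control table: one row per recode rule */
data work.ctl_region;
  infile datalines dsd;
  length code $1 label $20;
  input code label;
  datalines;
N,North
S,South
;
run;

/* Hypothetical input data to be processed by the generated code */
data work.have;
  input id region $;
  datalines;
1 N
2 S
;
run;

/* Placeholder path: a designated folder for generated programs */
filename gen "/path/to/projectA/generated/recode_region.sas";

/* Plain DATA step writes readable SAS code to a flat file */
data _null_;
  set work.ctl_region end=last;
  file gen;
  length line $200;
  if _n_ = 1 then
    put 'data work.recoded;' / '  set work.have;' / '  length region_name $20;';
  line = catx(' ', 'if region =', quote(strip(code)),
                   'then region_name =', quote(strip(label))) || ';';
  put '  ' line;
  if last then put 'run;';
run;

/* Run the generated program; SOURCE2 shows its lines in the log */
%include gen / source2;
```

Adding a rule then means adding a row to the control table; the generator itself stays untouched, and the generated file can be opened and reviewed like any hand-written program.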

FWIW,

Paul D.

SASKiwi · Posted 10-06-2019 05:21 PM

@hashman - You are preaching to the converted here. Data-driven application generation is the way to go. Keeping control data separate and versioned is key. My preference is macro-driven code generation, but %include-type code generation is equally valid.

Sounds like we are on the same page here!

hashman · Posted 10-06-2019 07:33 PM

@SASKiwi: It surely looks that way!
