Anybody have some materials / advice on creating a program and how to structure it?
For example, I'm working on a program that is currently 600+ lines long and might end up being 1000 when all is said and done. It has various data steps, PROC SQL steps, macros, etc. It creates 40+ tables, some temporary, some not. I'm still learning SAS and haven't really seen anything on structuring a long process.
I've started creating user-defined macros in the program to help show what each series of steps is doing. Do people tend to do this? I've been thinking of going back and doing the same for other steps that I didn't put in macros. Should I create a separate program for each of these macros, then %INCLUDE them in a main program that loads them all and runs the macros?
Any advice would be great!
Thanks,
Are you a BA? Don't break what works. Sometimes people write long, convoluted programs to handle long, convoluted processes.
First: 1000 lines doesn't really count as a "long program".
If a process is moderately long, I tend to divide it into functional areas and create a separate, self-contained program file for each function.
Typically those would be: 1) read the data (if an external file is the source); 2) clean the data or verify that it is usable (no missing values for key variables, variables within defined ranges, all the groups that should be present, such as organizations, actually are); 3) restructure the data or add variables needed for analysis; 4) analysis, which likely creates data sets for 5) reporting.
Pieces that are likely to be needed in multiple parts of the process, such as library definitions, formats, informats and (cautiously) macros, would go into a program file that is included at the top of each of those programs (or that you make sure is run before any of them).
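As a rough illustration of such a shared setup file, here is a minimal sketch. The file name 00_setup.sas, the project_root path, the librefs and the agegrp format are all made up for the example; substitute whatever your project actually needs.

```sas
/* 00_setup.sas: hypothetical shared setup, run before any other step */

/* Assumed project location; change to your own path */
%let project_root = /path/to/project;

/* Library definitions used by the later programs (directories must exist) */
libname raw   "&project_root/data/raw";
libname clean "&project_root/data/clean";

/* A format that several steps might share (illustrative ranges only) */
proc format;
  value agegrp
    low -< 18  = 'Under 18'
    18  -< 65  = '18 to 64'
    65  - high = '65 and over';
run;
```

Keeping these definitions in one place means a changed path or format only has to be edited once.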
Those areas may have sub-parts. For example, reading external files from multiple sources likely requires a separate read and clean step for each source, and likely means the restructure step includes combining the data.
If I have to go back to modify a variable or the like, then rerunning the dependent pieces is relatively easy.
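One way to tie the pieces together is a small driver program that includes each step in order. This is only a sketch: the file names and the project_root path are hypothetical, and it assumes each step reads the data sets written by the step before it.

```sas
/* main.sas: hypothetical driver that runs each functional area in order */

/* Assumed project location; change to your own path */
%let project_root = /path/to/project;

/* SOURCE2 echoes the included code to the log, which helps when debugging */
%include "&project_root/programs/00_setup.sas"       / source2;
%include "&project_root/programs/01_read.sas"        / source2;
%include "&project_root/programs/02_clean.sas"       / source2;
%include "&project_root/programs/03_restructure.sas" / source2;
%include "&project_root/programs/04_analysis.sas"    / source2;
%include "&project_root/programs/05_report.sas"      / source2;
```

If a variable changes in the restructure step, you can comment out the read and clean lines and rerun only from that point on, provided the earlier outputs were saved to a permanent library.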
One strong hint: when you get to the "report" phase, the code that appears between the ODS <destination>; and ODS <destination> CLOSE; statements really shouldn't contain any data manipulation. That keeps the output coding much cleaner.
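A minimal sketch of that separation, using the SASHELP.CLASS sample data set that ships with SAS. The output path and the work.class_sorted name are assumptions for the example.

```sas
/* Data manipulation happens BEFORE the ODS block */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

/* Inside the ODS block: reporting procedures only */
ods pdf file="/path/to/output/class_report.pdf";  /* assumed output location */

proc print data=work.class_sorted noobs;
  by sex;
  var name age height weight;
run;

ods pdf close;
```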
Splitting off macros into separate programs is a good idea. A really good way of handling these is by setting up an AUTOCALL macro library. See here for details:
The main advantage of an AUTOCALL library is you don't need to %INCLUDE the programs. SAS does that for you automatically.
I've built some pretty large SAS applications and I always use AUTOCALL macro libraries. They help keep your SAS applications structured and tidy as well.
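A minimal sketch of how such a library might be set up. The folder path, the mymacs fileref, the %clean_keys macro and the order_id variable are all hypothetical. The key rule is that each file is named after the macro it defines (lowercase file names on UNIX-like systems).

```sas
/* File: /path/to/project/macros/clean_keys.sas                 */
/* The file name must match the macro name for AUTOCALL to work */
%macro clean_keys(data=, out=);
  data &out;
    set &data;
    /* Drop rows with a missing key (order_id is an assumed variable) */
    if missing(order_id) then delete;
  run;
%mend clean_keys;
```

Then, in the calling program:

```sas
/* Point a fileref at the folder of macro files (assumed path) */
filename mymacs "/path/to/project/macros";

/* Search our folder first, then the SAS-supplied autocall library */
options mautosource sasautos=(mymacs sasautos);

/* No %INCLUDE needed: SAS finds and compiles clean_keys.sas on first use */
%clean_keys(data=work.raw_orders, out=work.orders_clean)
```

The work.raw_orders data set is assumed to exist already; the point is only that the first call to %clean_keys triggers the lookup.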