structure of a sas program

Occasional Contributor
Posts: 8


Hi all,

 

I have a general question about how to structure my SAS program. It goes like this:

First, I import the original data, do some data cleaning and transformation, and get data A. Then I do some analysis to get data B. Before going further, I realized I need data X, which is generated from data A. So I write code to generate data X, which I can then combine with data B to continue...

 

So the process of generating X is like a branch. I am wondering whether it is better to put this branch in a separate SAS file and call it when needed, or just leave it in the main code. I am afraid that if I need to generate too many branches, they will interfere with the main code.

 

Thank you,

Grand Advisor
Posts: 10,241

Re: structure of a sas program

Details, details, details.

 

Some of this is style choice. I tend to separate the read, clean and initial recoding steps into separate program files, partly to keep the amount of code readable, but also because re-reading large files is inefficient. I also prefer to have analysis steps segregated, possibly by type of analysis (simple summaries together, regressions together, other statistical tests together). But the scope of the project may not require all of that.
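One way to avoid re-reading a large raw file is to store the cleaned result in a permanent library so that later program files can start from it. The following is only a sketch: the library name `proj`, both paths, and the data set names are hypothetical and would need to match your own project.

```sas
/* Hypothetical library and paths; adjust to your project. */
libname proj "/path/to/project/data";

/* Read the raw file once. */
proc import datafile="/path/to/project/raw/original.csv"
    out=work.raw
    dbms=csv
    replace;
    guessingrows=max;   /* scan all rows when guessing column types */
run;

/* Clean and recode, then save the result permanently as proj.A. */
data proj.A;
    set work.raw;
    /* cleaning and recoding statements go here */
run;
```

A later analysis program would then only need the same `libname` statement and could read `proj.A` directly instead of importing the raw file again.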

 

I also try to minimize the number of source datasets. I can't tell whether you are creating separate data files A and X where X may be A plus some variables. If that is the case, I tend to go back to the code that generated an earlier data set such as A and add the additional recoded variables or whatever is needed, then rerun the previous analysis to ensure I haven't changed things inadvertently before proceeding to the steps that needed the additional variables. I find this tends to minimize the "oops, I used the wrong data set" mistakes.
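As a rough illustration of that approach, the toy example below adds a derived variable in the step that builds A instead of creating a separate X, then reruns an earlier summary as a check. Everything here is made up for illustration: the variables `id` and `income`, the cutoff of 50000, and the flag `high_income` are assumptions, not taken from the original question.

```sas
/* Toy raw data; values are invented for illustration only. */
data work.raw;
    input id income;
    datalines;
1 42000
2 58000
3 61000
;
run;

/* The step that creates A. The new variable is added here,
   so no separate data set X is needed. */
data work.A;
    set work.raw;
    /* existing cleaning statements would stay here */
    high_income = (income > 50000);   /* 1 if true, 0 otherwise (hypothetical recode) */
run;

/* Rerun an earlier summary to confirm the existing results did not change. */
proc means data=work.A n mean;
    var income;
run;
```

If X really does have a different structure from A (for example, one row per group rather than one row per record), a separate data set merged back in by a key variable is still the more natural fit.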

 

If you have lots of dependencies then 1) document, 2) take care naming things, 3) test everything and 4) document (at this point you'll likely need to add to the previous documentation).

 

 

Respected Advisor
Posts: 4,998

Re: structure of a sas program

A SAS program is always a series of DATA and PROC steps.  But the steps need not be saved as one huge program.  You could create a series of programs, such as:

 

1_create_and_clean_A.sas

2_create_B_from_A.sas

3_create_X_from_A.sas

4_combine_X_with_B.sas

 

You would just run them in the order indicated by the names.
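If running each file by hand becomes tedious, a small driver program can run them in sequence with `%include`. This is a sketch only: the driver file name `run_all.sas` and the folder path are assumptions, while the four program names are the ones listed above.

```sas
/* run_all.sas: hypothetical driver program. */
/* The folder path is an assumption; point it at your own program directory. */
%let root = /path/to/project/programs;

%include "&root/1_create_and_clean_A.sas";
%include "&root/2_create_B_from_A.sas";
%include "&root/3_create_X_from_A.sas";
%include "&root/4_combine_X_with_B.sas";
```

Because all four files run in the same SAS session, data sets that an earlier program leaves in the WORK library are still available to the later ones.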

 

Whatever works, and makes it clear for you to understand where to find the code.

 

If you need to re-use some of the code for many different incoming data sets, that becomes a different question with a different answer.

Respected Advisor
Posts: 3,066

Re: structure of a sas program


An interesting question.

 

I find that as your SAS programs get larger it becomes quite natural and logical to split them based on common functionality. I also use a number at the start of each program name to indicate the running order as already suggested by @Astounding.

 

With the larger SAS applications I work with, I also use the concept of levels as used in data warehousing. For example, all programs starting with the number 100 relate to reading and sourcing data, 200-level programs primarily transform and derive data, and 300-level programs primarily prepare data for analysis and reporting.

Occasional Contributor
Posts: 8

Re: structure of a sas program

Very helpful. Thank you for your responses.
