Biplob,

The first thing to consider is whether you want to improve disk space usage, performance, or both. I suspect it's "both", since macros tend to be used repeatedly.

The disk space issue can probably be solved by creating views instead of data sets. Both data steps and SQL can create views, and it is possible to use a view as input and another view as output of the same step. Without seeing the code (and I'm not really asking for that), it's hard to be more specific.

Performance will improve by adding KEEP=, but there is no shortcut. You have to manually work through the code to figure out what variables are needed when. You could start with the variables in the final output, but other variables might be needed as the program begins. For example, variables might be used to subset observations, or to make calculations, but might not be needed after that point. The only tool that can figure this out is the human brain.

One style that I like is to add to the outermost macro a set of %LET statements:

   %let keeplist1 = a long list of variables;
   %let keeplist2 = a different list;

Then refer to &KEEPLIST1 and &KEEPLIST2 in later code. This makes the programming easier to read, update, and debug.

Note that there is a difference between KEEP= on the SET statement and KEEP= on the DATA statement. The first limits what you read in, and the second limits what you save. You might find steps that use KEEP= on both, with different sets of variables.

To test your results, I suggest you run on either small data sets or on just one input data set using your current set of macros. Save the result (preferably as a SAS data set). Then after modifying the code, use PROC COMPARE to see if the new output differs from the old.

Good luck.
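
P.S. Here is a rough sketch of how those pieces might fit together. Since I haven't seen your code, every data set and variable name below (WORK.BIG, ID, SALE_DATE, AMOUNT, REGION, and the step and output names) is made up for illustration, and the tiny input data set exists only so the example runs on its own. Treat it as a starting point under those assumptions, not as a recipe for your macros.

```sas
/* Hypothetical sample input, standing in for one of your real data sets */
data work.big;
   input id sale_date :date9. amount region $;
   format sale_date date9.;
   datalines;
1 01JAN2020 100 EAST
2 02JAN2020 -5 WEST
3 03JAN2020 250 EAST
;
run;

/* Baseline: the "current" logic, saved as a SAS data set before any changes */
data work.final_old;
   set work.big;
   where region = 'EAST' and amount > 0;
   keep id sale_date amount;
run;

/* Keep lists defined once, as they might be in the outermost macro */
%let keeplist1 = id sale_date amount region;  /* needed while processing */
%let keeplist2 = id sale_date amount;         /* needed in the final output */

/* A DATA step view: nothing is written to disk until the view is read. */
/* KEEP= on the SET statement limits what is read in.                   */
data work.step1 / view=work.step1;
   set work.big (keep=&keeplist1);
   where region = 'EAST';   /* REGION is needed here for subsetting only */
run;

/* An SQL view that uses one view as input and creates another as output */
proc sql;
   create view work.step2 as
   select id, sale_date, amount
   from work.step1
   where amount > 0;
quit;

/* KEEP= on the DATA statement limits what is saved */
data work.final_new (keep=&keeplist2);
   set work.step2;
run;

/* Check whether the modified code reproduces the saved baseline */
proc compare base=work.final_old compare=work.final_new;
run;
```

Note how REGION appears in the first keep list but not the second: it is needed to subset the observations, but not after that point. The intermediate steps are views, so only the baseline and the final output take up disk space.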