BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Biplob94
Calcite | Level 5

I have a SAS program (multiple macros deep) that has:

1. Several input datasets with several thousands of fields, all being read in.

2. Few output flat files with hundreds of fields

The issue is that the intermediate datasets are so large it is causing space issues and the performance is really bad. I want to identify only those fields on the input side that are essential, and use a keep statement.

Besides manual tracing, is there any tool or method that I should use? We have sas (r) 9.1.3 on aix 5.3

I appreciate any guidance

Biplob

1 ACCEPTED SOLUTION

Accepted Solutions
Hima
Obsidian | Level 7

The below code creates TABLE_A with the records that produces rows that are part of the first query only.

PROC SQL;

CREATE TABLE TABLE_A AS

SELECT * FROM TABLE1

EXCEPT

SELECT * FROM TABLE 2;

QUIT;

View solution in original post

4 REPLIES 4
art297
Opal | Level 21

What happens between initial input and final output.  Obviously, you have to keep the fields that are on the final output's keep statement, but what other processes/datasteps/sql code/procs are used during the process?

Such tasks can either be trivial or extremely difficult.  Depends upon your code.

Hima
Obsidian | Level 7

The below code creates TABLE_A with the records that produces rows that are part of the first query only.

PROC SQL;

CREATE TABLE TABLE_A AS

SELECT * FROM TABLE1

EXCEPT

SELECT * FROM TABLE 2;

QUIT;

Astounding
PROC Star

Biplob,

The first thing to consider is whether you want to improve disk space usage, performance, or both.  I suspect it's "both", since macros tend to be used repeatedly.

The disk space issue can probably be solved by creating views instead of data sets.  Both data steps and SQL can create views, and it is possible to use a view as input and another view as output of the same step.  Without seeing the code (and I'm not really asking for that), it's hard to be more specific.

Performance will improve by adding KEEP=, but there is no shortcut.  You have to manually work through the code to figure out what variables are needed when.  You could start with the variables in the final output, but other variables might be needed as the program begins.  For example, variables might be used to subset observations, or to make calculations, but might not be needed after that point.  The only tool that can figure this out is the human brain.

One style that I like is to add to the outermost macro a set of %LET statements:

%let keeplist1 = a long list of variables;

%let keeplist2 = a different list;

Then refer to &KEEPLIST1 and &KEEPLIST2 in later code.  This makes the programming easier to read, update, and debug.

Note that there is a difference between KEEP= on the SET statement and KEEP= on the DATA statement.  The first limits what you read in, and the second limits what you save.  You might find steps that use KEEP= on both, with different sets of variables.

To test your results, I suggest you run on either small data sets or on just one input data set using your current set of macros.  Save the result (preferably as a SAS data set).  Then after modifying the code, use PROC COMPARE to see if the new output differs from the old.

Good luck.

Biplob94
Calcite | Level 5

Thanks to everyone who responded. I now know what to do.

This is the first time I used a forum for help and am very impressed by the depth and promptness of the answers.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 859 views
  • 6 likes
  • 4 in conversation