Why doesn't SAS keep only the variables needed for certain procedures?

Occasional Contributor
Posts: 17

Is there a design reason that SAS does not create an implicit KEEP statement for procedures that specify variables? For example, the PROC FREQ TABLES statement lists all the variables used in the procedure. Yet there is a significant speedup when I add a KEEP statement specifying those variables. Why doesn't SAS implicitly create this KEEP statement?

Super User
Posts: 5,424

Re: Why doesn't SAS keep only the variables needed for certain procedures?

Great finding!

Never occurred to me that this could be the case.

If you are correct, this should be fairly "cheap" for SAS to implement.

Data never sleeps
Respected Advisor
Posts: 3,799

Re: Why doesn't SAS keep only the variables needed for certain procedures?

I would like to see your test case(s). As they say, it didn't happen if there ain't no picture.

Community Manager
Posts: 2,952

Re: Why doesn't SAS keep only the variables needed for certain procedures?

I'm not sure that this is the case in general, especially with the BASE engine. But for certain types of data sources (such as third-party databases), it's possible that you're seeing a back-end cost.

In recent versions of SAS (especially 9.3 and later), many procedures optimize their table access to push work to the database. But some constructs can break or prevent that.

I agree with @data_null__: share a specific example of what you're seeing.

Super User
Posts: 11,343

Re: Why doesn't SAS keep only the variables needed for certain procedures?


Zelazny7 wrote:

Is there a design reason that SAS does not create an implicit KEEP statement for procedures that specify variables? For example, the PROC FREQ TABLES statement lists all the variables used in the procedure. Yet there is a significant speedup when I add a KEEP statement specifying those variables. Why doesn't SAS implicitly create this KEEP statement?

PROC FREQ tabulates all of the variables only if you don't supply a TABLES statement.

See the difference between

proc freq data=sashelp.class;
run;

and

proc freq data=sashelp.class;
   tables sex age;
run;

Valued Guide
Posts: 860

Re: Why doesn't SAS keep only the variables needed for certain procedures?

Ballard, I think what they are saying is that SAS still reads in all of the variables, even if you put only one variable in the TABLES statement.
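
A minimal sketch of the kind of comparison being discussed, not the original poster's test case. The table work.wide and its filler columns pad1-pad200 are hypothetical, built here only so that the input is much wider than the two variables being tabulated. On a table this small any timing difference will be negligible; the point is only to show the two forms side by side, with FULLSTIMER writing the resource usage of each step to the log.

```sas
options fullstimer;               /* write detailed timing for each step to the log */

/* Hypothetical wide table: the five SASHELP.CLASS variables plus 200 filler columns */
data work.wide;
   set sashelp.class;
   array pad{200} pad1-pad200;    /* filler columns, assumed for illustration only */
   do i = 1 to 200;
      pad{i} = ranuni(1);
   end;
   drop i;
run;

/* No KEEP=: every variable in work.wide is available to the procedure */
proc freq data=work.wide;
   tables sex age;
run;

/* KEEP= data set option: only sex and age are made available to the procedure */
proc freq data=work.wide(keep=sex age);
   tables sex age;
run;
```

Comparing the two PROC FREQ entries in the log on a realistically large table, and on the engine actually in use, would show whether the speedup described in the question appears.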

SAS Super FREQ
Posts: 3,752

Re: Why doesn't SAS keep only the variables needed for certain procedures?

One "design reason" is that many procedures have an OUTPUT statement that enables you to create an output data set that contains ALL the input variables AND the created variables. For example, in PROC REG, if you say

OUTPUT out=MYOUT P=Pred R=Resid;

then the output data set contains all input variables in addition to the predicted and residual variables from the regression.

Because the output from one procedure is often used as the input to another procedure, this avoids a separate MERGE between procedure calls.

Furthermore, PROC REG is an interactive procedure, so you can specify a model, then execute the RUN statement. After the model has run, you can specify the OUTPUT statement to get the output data set. In other words, when the procedure encounters the first RUN statement, it does not know whether there will be an OUTPUT statement later in the program.
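
The interactive pattern described above can be sketched as follows. The data set sashelp.class and the model weight = height are assumptions chosen for illustration; the OUTPUT statement is the one quoted earlier in this post, with the output data set written to WORK.

```sas
proc reg data=sashelp.class;
   model weight = height;
run;                               /* the model is fit, but the procedure stays active */

   /* Submitted after the first RUN: requests predicted values and residuals */
   output out=work.myout p=Pred r=Resid;
run;
quit;                              /* ends the interactive procedure */

/* work.myout holds the input variables (including Name, Sex, and Age,
   which the MODEL statement never mentions) plus Pred and Resid */
proc contents data=work.myout;
run;
```

If the procedure had dropped Name, Sex, and Age when it first read the data, it could not supply them when the later OUTPUT statement arrives.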

 
