Zelazny7
Fluorite | Level 6

Is there a design reason that SAS does not create an implicit keep statement for procedures that specify variables? For example, the PROC FREQ TABLES statement lists all the variables used in the procedure. Yet there is a significant speedup when I add a keep statement specifying those variables. Why doesn't SAS implicitly create this keep statement?

 

 

LinusH
Tourmaline | Level 20

Great finding!

Never occurred to me that this could be the case.

If you are correct, this should be fairly "cheap" for SAS to implement.

Data never sleeps
data_null__
Jade | Level 19

I would like to see your test case(s).  As they say it didn't happen if there ain't no picture.

ChrisHemedinger
Community Manager

I'm not sure that this is the case in general, especially with the BASE engine.  But for certain types of data sources (such as from 3rd party databases) it's possible that there is a back-end cost you're seeing.

 

In recent versions of SAS (esp 9.3 and later), many procs will optimize their table access to push work to the database.  But some constructs can break/prevent that.

 

I agree with @data_null__: share a specific example of what you're seeing.

ballardw
Super User

@Zelazny7 wrote:

Is there a design reason that SAS does not create an implicit keep statement for procedures that specify variables? For example, the PROC FREQ TABLES statement lists all the variables used in the procedure. Yet there is a significant speedup when I add a keep statement specifying those variables. Why doesn't SAS implicitly create this keep statement?

 

 


Proc Freq only lists all of the variables if you don't supply them on a TABLES statement.

See the difference between

proc freq data=sashelp.class;
run;

and

proc freq data=sashelp.class;
   tables sex age;
run;

Steelers_In_DC
Barite | Level 11

Ballard, I think what they are saying is that SAS still reads in all of the variables, even if you only put one variable in the TABLES statement.
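
For anyone who wants to try the comparison being described, a minimal sketch might look like the following. The table WORK.WIDE, its filler columns, and the row count are invented for illustration, and it assumes the "keep statement" in the original question means the KEEP= data set option. Whether the two runs differ, and by how much, will depend on the engine and the data source.

```sas
/* Hypothetical test table: two analysis variables plus 200 unused numeric columns */
data work.wide;
   length sex $1;
   array filler{200} 8;      /* creates filler1-filler200, left missing */
   do i = 1 to 100000;
      sex = ifc(mod(i, 2), 'M', 'F');
      age = 11 + mod(i, 6);
      output;
   end;
   drop i;
run;

options fullstimer;          /* print detailed timing notes in the log */

/* Run 1: no KEEP= option on the input data set */
proc freq data=work.wide;
   tables sex age;
run;

/* Run 2: KEEP= restricts the input to the two analysis variables */
proc freq data=work.wide(keep=sex age);
   tables sex age;
run;
```

Compare the real time and CPU time notes that follow each PROC FREQ step in the log.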

Rick_SAS
SAS Super FREQ

One "design reason" is that many procedures have an OUTPUT statement that enables you to create an output data set that contains ALL the input variables AND the created variables. For example, in PROC REG, if you say

OUTPUT out=MYOUT P=Pred R=Resid;

then the output data set contains all input variables in addition to the predicted and residual variables from the regression.

Because the output from one procedure is often used as the input to another procedure, this avoids the need for a separate MERGE between procedure calls.
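
As a small illustration of that behavior, the sketch below runs a regression on SASHELP.CLASS and then lists the columns of the output data set. The model (weight on height) and the name WORK.MYOUT are chosen only for the example.

```sas
/* Only Weight and Height appear in the MODEL statement */
proc reg data=sashelp.class;
   model weight = height;
   output out=work.myout p=Pred r=Resid;
run;
quit;

/* List the columns of the output data set in creation order */
proc contents data=work.myout varnum;
run;
```

The PROC CONTENTS listing should show the variables that were not in the model (such as Name, Sex, and Age) alongside Pred and Resid, which would not be possible if the procedure had dropped them on input.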

 

Furthermore, PROC REG is an interactive procedure, so you can specify a model, then execute the RUN statement.  After the model has run, you can specify the OUTPUT statement to get the output data set.  In other words, the procedure does not know when it encounters the first RUN statement whether there will be an OUTPUT statement later in the program.
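
A sketch of that interactive sequence, again using SASHELP.CLASS with an arbitrary model and output data set name:

```sas
proc reg data=sashelp.class;
   model weight = height;
run;                                   /* the model is fit here */

   /* submitted afterward, while the procedure is still active */
   output out=work.myout p=Pred r=Resid;
run;
quit;                                  /* ends the interactive procedure */
```

At the first RUN the procedure has seen only the MODEL statement, so it cannot tell at that point which input variables a later OUTPUT statement will need to carry along.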

 
