SAS multiple choice question - Keep or Drop efficiency

Occasional Contributor
Posts: 8

Hi,

I think Option A is correct. Please clarify.

Thanks.

A data set stored on a network drive has the following characteristics:
14 Million observations
400 numeric variables
0 character variables of length 20
Binary compression
A DATA Step query requires only 3 character and 15 numeric variables from this data set. What is the best way to reduce computer resource utilization in this DATA Step?
A. A KEEP= data set option used on the SET Statement
B. A KEEP Statement used within the DATA Step
C. A KEEP= data set option used on the DATA Statement
D. A DROP= data set option used on the DATA Statement

SAS Super FREQ
Posts: 9,423

Re: BASE SAS

Are you using the Prep Guide or a Practice Exam? What does the answer key say?

Otherwise, you can solve this by understanding that KEEP= or DROP= data set options on the SET statement restrict the variables that are loaded from the INPUT data set, and thus reduce the resources needed to load and manipulate the input data.

Any KEEP or DROP statement used within the DATA step program has no impact on what the SET statement reads, so ALL the variables are read first, and only then is the KEEP or DROP you specified applied.

In a similar fashion, with the KEEP= or DROP= option on the DATA Statement, you are only impacting the OUTPUT file (not the INPUT file on the SET), so while you might save a bit by restricting the size of the OUTPUT data set, you're not saving anything on the INPUT data set, which is where you want to do your restriction. There is no point to reading in ALL the numeric and ALL the character variables for the few that you want to use.
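
As a rough sketch of the three placements described above, the steps below use a small stand-in table. The names WORK.BIG, C1-C5, and N1-N20 are made up for illustration and are not from the exam question.

```sas
/* Small stand-in for the large input table (hypothetical names). */
data work.big;
   array n{20} n1-n20;
   array c{5} $ 20 c1-c5;
   do obs = 1 to 100;
      do i = 1 to 20; n{i} = obs * i; end;
      do j = 1 to 5; c{j} = cats('val', obs, '_', j); end;
      output;
   end;
   drop i j;
run;

/* KEEP= on the SET statement: only the listed variables are
   brought into the program data vector (PDV). */
data work.subset_a;
   set work.big(keep=c1-c3 n1-n15);
run;

/* KEEP statement: every input variable enters the PDV;
   the list only limits what is written to the output. */
data work.subset_b;
   set work.big;
   keep c1-c3 n1-n15;
run;

/* KEEP= on the DATA statement: again an output-side filter;
   the input is still read in full. */
data work.subset_c(keep=c1-c3 n1-n15);
   set work.big;
run;
```

All three steps should produce the same 18 variables in the output. The difference is only in how many variables the step has to carry while it runs.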

cynthia
Occasional Contributor
Posts: 8

Re: BASE SAS

Posted in reply to Cynthia_sas

Thank you. I am using a practice exam, and the key says Option D. Is that correct?

Super User
Posts: 23,928

Re: BASE SAS

No. 

SAS Super FREQ
Posts: 9,423

Re: BASE SAS

Hi:
If this is the Pearson VUE or SAS Practice Exam, then please send mail to training@sas.com and report the question. We would need to know the exact name of the exam you took, when you bought or took the exam and the question number to track it down.

If this is a practice exam from some other company, then you should report the error to them.

Thanks,
cynthia
PROC Star
Posts: 269

Re: SAS multiple choice question - Keep or Drop efficiency

I agree with you that A is the best option. It is better than B or C because you do not spend CPU and memory reading a lot of data that you do not need. It also helps if you want to create new variables in the DATA step: you will not have to worry about the names of your new variables clashing with the names of existing, but unwanted, variables in the table being read.

Option A is also better than option D, for two reasons:

  1. It is easier to read. A DROP= option is like going to the baker's shop and listing all the stuff you do not want. It is easier to tell the baker what you actually want. In other words, KEEP= is easier to read and maintain.
  2. If your input data changes (variables are dropped or added), a KEEP= option is safer: You will get a message in the log if a variable that you want has been dropped, and you will not automatically add new variables that you are not interested in.
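
A minimal sketch of the name-clash point, using made-up names (WORK.RAW, ID, AMOUNT1, AMOUNT2, TOTAL) that are not from the question. Here the input already has an unwanted character variable called TOTAL, and the step wants to create a new numeric TOTAL.

```sas
/* Hypothetical input with an unwanted character variable TOTAL. */
data work.raw;
   length id 8 amount1 amount2 8 total $10;
   id = 1;
   amount1 = 10;
   amount2 = 5;
   total = 'n/a';
run;

/* Because TOTAL is not in the KEEP= list, it is never read,
   so the assignment below creates a fresh numeric variable. */
data work.clean;
   set work.raw(keep=id amount1 amount2);
   total = amount1 + amount2;
run;
```

Without the KEEP= option, the assignment would go into the existing character TOTAL instead of creating a new numeric variable. As for the second point, naming a variable in KEEP= that no longer exists in the input should, as far as I know, stop the step with an error by default (the DKRICOND= system option controls this).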
Super User
Posts: 10,844

Re: SAS multiple choice question - Keep or Drop efficiency

I would pick A.

Frequent Contributor
Posts: 79

Re: SAS multiple choice question - Keep or Drop efficiency

I think option A is the correct answer.
The reason is that with KEEP= on the SET statement, only the required variables are kept; the others are never read into the step.

However, with a KEEP statement (option B), the step reads all variables and their values for all 14 million observations, and then drops every column except those named in the KEEP statement while creating the final data set.

So all the other options waste memory and CPU unnecessarily.

Cheers from India!

Manjeet
Super User
Posts: 8,260

Re: SAS multiple choice question - Keep or Drop efficiency

The best answer is A.

With even partial knowledge of how the DATA step really works, you should realize that B, C, and D are all saying the same thing, so test-taking skills should lead you to pick A.

Perhaps the test randomizes the order of the choices and you are looking at the wrong answer key?
