03-22-2018 09:08 PM - last edited on 03-22-2018 11:28 PM by Reeza
I think option A is correct. Please clarify.
A data set stored on a network drive has the following characteristics:
14 Million observations
400 numeric variables
0 character variables of length 20
A DATA Step query requires only 3 character and 15 numeric variables from this data set. What is the best way to reduce computer resource utilization in this DATA Step?
A. A KEEP= data set option used on the SET Statement
B. A KEEP Statement used within the DATA Step
C. A KEEP= data set option used on the DATA Statement
D. A DROP= data set option used on the DATA Statement
03-22-2018 09:16 PM
03-23-2018 08:44 AM
03-23-2018 08:04 AM
I agree with you that A is the best option. It is better than B or C for two reasons. First, you do not spend CPU and memory on reading a lot of data that you do not need. Second, if you create new variables in the data step, you do not have to worry about their names clashing with the names of existing, but unwanted, variables in the table being read.
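As a rough sketch of the difference between A and B: the libref NETLIB, the data set BIG, the variable names c1-c3 and n1-n15, and the new variable total are all made up for illustration and are not from the exam question.

```sas
/* Option A: KEEP= on the SET statement (hypothetical names throughout). */
/* Only the 18 listed variables are brought into the program data vector. */
data work.subset_a;
    set netlib.big(keep=c1-c3 n1-n15);
    /* A new variable cannot clash with an unwanted input variable, */
    /* because the unwanted variables are never read.               */
    total = sum(of n1-n15);
run;

/* Option B: KEEP statement inside the DATA step. */
/* Every variable in NETLIB.BIG is read; the KEEP statement */
/* only limits what is written to WORK.SUBSET_B.            */
data work.subset_b;
    set netlib.big;
    total = sum(of n1-n15);
    keep c1-c3 n1-n15 total;
run;
```

In the second step, if NETLIB.BIG happened to contain its own variable named total, the assignment would overwrite the value read from the input, which is the kind of clash that option A avoids.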
Option A is also better than option D, for two reasons:
03-25-2018 05:55 AM
03-25-2018 09:40 AM
The best answer is A.
Even with partial knowledge of how the data step really works, you should realize that B, C, and D are all saying the same thing, so test-taking skills should lead you to pick A.
Perhaps the test randomizes the order of the choices and you are looking at the wrong answer key?
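To illustrate why C and D behave like B, here is a minimal sketch. The libref NETLIB, the data set BIG, and the variable names c1-c3, n1-n15, and n16-n400 are assumptions made up for this example.

```sas
/* Option C: KEEP= on the DATA statement (hypothetical names). */
/* All input variables are still read; only the output is trimmed. */
data work.subset_c(keep=c1-c3 n1-n15);
    set netlib.big;
run;

/* Option D: DROP= on the DATA statement. */
/* Again all input variables are read, and the list must name every  */
/* unwanted variable (any unwanted character variables would have to */
/* be added to it as well).                                          */
data work.subset_d(drop=n16-n400);
    set netlib.big;
run;
```

In both steps the selection is applied when the output data set is written, not when NETLIB.BIG is read, which is why neither reduces the cost of reading the input.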