01-05-2016 11:12 AM
Hello, can someone help me understand why the answer is 2?
Given the SAS data set WORK.PRODUCT
(See attached image).
I thought the answer would be 6 because, as far as I understand it, dropping a variable only removes it from the output and not from the actual data set. So WORK.PRODUCT had 5 variables, and WORK.REVENUE inherited those 5 variables and added one more (Revenue). That makes 6. Why am I wrong?
01-05-2016 11:25 AM
01-05-2016 12:16 PM
So the following question led to my confusion:
Given what you know about how SAS processes the DROP and KEEP statements, would these two DATA steps create the same data set?
Correct answer: a
Variables in the DROP statement are dropped during output, so they're available for calculations in the DATA step, even if they follow the statements that reference them.
If variables can still be used in the calculations, wouldn't that mean they still exist in the data set? Thanks much!
01-05-2016 12:49 PM
Here's a simplified way of looking at the processing within a DATA step.
Variables get copied from the input data set into memory.
Calculations get performed on the values in memory.
Results get copied from memory to the output data set.
When a DROP statement appears, that affects the final step, copying from memory to the output data set. It is not necessary to copy all variables to the output. Even a variable that was used in calculations can be dropped from the output data set.
If you are going to research this a bit more on your own, "in memory" would refer to the PDV (Program Data Vector).
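The three steps above can be sketched with a small DATA step. This is only an illustration under assumed names: WORK.SALES and the variables Product, Price, and Quantity are hypothetical and are not taken from the exam question.

```sas
/* Hypothetical input data; names and values are made up for illustration */
data work.sales;
   input Product $ Price Quantity;
   datalines;
Widget 10 3
Gadget 25 2
;
run;

data work.revenue;
   set work.sales;              /* step 1: values are read into the PDV     */
   Revenue = Price * Quantity;  /* step 2: calculation uses the PDV values  */
   drop Price Quantity;         /* step 3: these are not copied to output   */
run;

proc print data=work.revenue;
run;
```

Price and Quantity are in the PDV, so the Revenue calculation works, but WORK.REVENUE ends up with only Product and Revenue. WORK.SALES itself is unchanged and still has all three of its variables.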
01-05-2016 01:20 PM
01-05-2016 01:36 PM
Just to get nit-picky about a small detail: when a variable is dropped, it never appears in the output data set. "DROP" means that when copying from memory to the output data set, SAS doesn't bother to copy this variable.
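One way to check this for yourself is to look at the output data set with PROC CONTENTS. The sketch below uses hypothetical names (WORK.ITEMS, WORK.TOTALS, Price, Quantity, Total) and also puts the DROP statement before the calculation, to show that its position in the step does not matter.

```sas
/* Hypothetical input data for illustration only */
data work.items;
   input Price Quantity;
   datalines;
10 3
25 2
;
run;

data work.totals;
   drop Price;                 /* declarative: takes effect when output is written */
   set work.items;
   Total = Price * Quantity;   /* Price is still available in the PDV here */
run;

/* The variable list for WORK.TOTALS shows Quantity and Total, not Price */
proc contents data=work.totals;
run;
```

Running PROC CONTENTS on WORK.ITEMS as well would show that the input data set still has Price. The DROP statement only affects what gets written to the data set being created.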
01-05-2016 02:06 PM
Yeah, thanks much for your help. I think I get it now (for some reason I can't remove the Italics). The 2nd question I posted confused me into thinking that the variable actually stays on, but just isn't displayed in the output.
01-05-2016 12:49 PM
You're making this too hard.
There's data available to be used while creating a data set, and then there's data that's contained in the final output data set. They are two different things. Otherwise, what's the point of a DROP statement if the variable is still there in the end?