Drop and Keep Question

Contributor
Posts: 24

Hello, can someone help me understand why the answer is 2?

 

Given the SAS data set WORK.PRODUCT 

 

(See attached image).

 

I thought the answer would be 6 because, as far as I understand it, dropping a variable only removes it from the output and not from the actual data set. So Work.Products originally had 5 variables. Work.Revenue inherited these 5 variables and added one more (Revenue). That makes 6. Why am I wrong?


SAS Question.PNG

Super User
Posts: 17,819

Re: Drop and Keep Question

Dropping a variable removes it from the data set.

I'm not sure how you're differentiating the output from the final dataset, but in this case they're the same thing.

The correct answer is 2:
  Source data = 4 variables
  + calculate a new variable = 5 variables
  - drop 3 variables = 2 variables
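
Since the attached image isn't reproduced here, the count can be checked with a small stand-in. This is only a sketch: the variable names (ProdId, Price, Quantity, Cost) and the Revenue formula are assumptions, not the contents of the actual WORK.PRODUCT in the question.

```sas
/* Hypothetical stand-in for WORK.PRODUCT: 4 variables (names are assumptions) */
data work.product;
   input ProdId Price Quantity Cost;
   datalines;
1 10 3 4
2 20 1 9
;
run;

/* 4 source variables + 1 new variable (Revenue) - 3 dropped = 2 variables */
data work.revenue;
   set work.product;
   Revenue = Price * Quantity;   /* Price and Quantity are still usable here */
   drop Price Quantity Cost;     /* applied only when the observation is written out */
run;

/* The variable list should show only ProdId and Revenue */
proc contents data=work.revenue;
run;
```

PROC CONTENTS lists the variables that were actually written to WORK.REVENUE, which is what the exam question is counting.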
Contributor
Posts: 24

Re: Drop and Keep Question

So the following question led to my confusion:

 

Given what you know about how SAS processes the DROP and KEEP statements, would these two DATA steps create the same data set?

data work.subset1;
   set orion.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

data work.subset1;
   set orion.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;

a. Yes
b. No

 

Correct answer: a

Variables in the DROP statement are dropped during output, so they're available for calculations in the DATA step, whether the DROP statement comes before or after the statements that reference them.

 

If variables can still be used in the calculations, wouldn't that mean they still exist in the data set? Thanks much!
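
One way to check the quiz's claim is to run both versions and compare the results. The sketch below assumes a tiny stand-in for orion.sales (WORK.SALES, with made-up values) and writes the second step to WORK.SUBSET2, a hypothetical name chosen so the two results can sit side by side.

```sas
/* Hypothetical stand-in for orion.sales; names and values are assumptions */
data work.sales;
   input Employee_ID Salary Hire_Date :date9.;
   format Hire_Date date9.;
   datalines;
101 30000 01JAN2010
102 45000 15JUN2012
;
run;

/* Version 1: DROP appears before Salary is used */
data work.subset1;
   set work.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

/* Version 2: DROP appears after Salary is used */
data work.subset2;
   set work.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;

/* Compare variables and values of the two output data sets */
proc compare base=work.subset1 compare=work.subset2;
run;
```

If the quiz answer is right, PROC COMPARE should report no differences in variables or values, and neither data set should contain Salary.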

Super User
Posts: 5,082

Re: Drop and Keep Question

Here's a simplified way of looking at the processing within a DATA step.

 

Variables get copied from the input data set into memory.

 

Calculations get performed on the values in memory.

 

Results get copied from memory to the output data set.

 

When a DROP statement appears, that affects the final step, copying from memory to the output data set.  It is not necessary to copy all variables to the output.  Even a variable that was used in calculations can be dropped from the output data set.

 

If you are going to research this a bit more on your own, "in memory" would refer to the PDV (Program Data Vector).
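
To see the difference between "in memory" and "in the output data set" for yourself, the sketch below writes the PDV to the log on each iteration. The data set WORK.STAFF and its values are made up for illustration.

```sas
/* Hypothetical input data; names and values are assumptions */
data work.staff;
   input Name $ Salary;
   datalines;
Ann 50000
Bob 42000
;
run;

data work.pay;
   set work.staff;                     /* values are read into the PDV */
   drop Salary;                        /* marks Salary: do not copy it to the output */
   Bonus = 500;
   Compensation = sum(Salary, Bonus);  /* Salary is still in the PDV here */
   put _all_;                          /* writes every PDV variable to the log */
run;

/* The output data set has Name, Bonus and Compensation, but no Salary */
proc print data=work.pay;
run;
```

The log lines written by PUT _ALL_ include Salary with its value, while the PROC PRINT listing does not show a Salary column.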

 

Good luck.

Contributor
Posts: 24

Re: Drop and Keep Question

So just to be clear, after the "copying from memory to the output data set" occurs, the variables are permanently dropped from the data set, right? For example, you can't drop a variable in one DATA step and then use that same variable for calculations in the next DATA step (assuming the second step reads the data set created by the first). Is that correct?
Solution
‎01-05-2016 01:29 PM
Super User
Posts: 17,819

Re: Drop and Keep Question

That's correct.
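
A small sketch of what that looks like in practice, using made-up data set names (WORK.STEP1, WORK.STEP2) and values. Note that SAS does not stop with an error in the second step; it creates a new, empty Salary variable instead.

```sas
/* Step 1: Salary is used in memory but dropped from the output */
data work.step1;
   Salary = 40000;
   Bonus = 500;
   drop Salary;
run;

/* Step 2: WORK.STEP1 has no Salary, so referencing it creates a new
   variable with a missing value. The log shows a note that
   Salary is uninitialized. */
data work.step2;
   set work.step1;
   Compensation = sum(Salary, Bonus);  /* SUM ignores the missing Salary */
run;

proc print data=work.step2;
run;
```

Here Compensation comes out as 500 rather than 40500, because the original Salary value was never written to WORK.STEP1.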
Super User
Posts: 5,082

Re: Drop and Keep Question

Just to get nit-picky about a small detail, when a variable is dropped it never appears in the output data set.  "DROP" means when copying from memory to the output data set, don't bother to copy this variable.

Contributor
Posts: 24

Re: Drop and Keep Question

Yeah, thanks much for your help. I think I get it now (for some reason I can't remove the Italics). The 2nd question I posted confused me into thinking that the variable actually stays on, but just isn't displayed in the output.

Super User
Posts: 17,819

Re: Drop and Keep Question

You're making this too hard.

 

There's data available to be used while creating a data set, and then there's data that's contained in the final output data set. They are two different things. Otherwise what's the point of a DROP statement, if the variable is still there in the end?

Occasional Contributor
Posts: 5

Re: Drop and Keep Question

The output data set will contain two variables:

1 + 1 (newly created variable) = 2

☑ This topic is SOLVED.
