Durlov
Obsidian | Level 7

Hello, can someone help me understand why the answer is 2?

 

Given the SAS data set WORK.PRODUCT 

 

(See attached image).

 

I thought the answer would be 6 because, as far as I understand it, dropping a variable only removes it from the output and not from the actual data set. So previously Work.Products had 5 variables. Work.Revenue inherited these 5 variables and added one more (Revenue). So that makes 6. Why am I wrong?


SAS Question.PNG
Reeza
Super User
Dropping a variable removes it from the data set.

I'm not sure how you're differentiating the output from the final dataset, but in this case they're the same thing.

The correct answer is 2:
-> Source data = 4 variables
-> Calculate a new variable = 5 variables
-> Drop 3 variables = 2 variables
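
To check the count above in a SAS session, here is a minimal sketch. The attached image is not reproduced in this thread, so the variable names (Prod_ID, Price, Sales, Returns), the values, and the Revenue formula are all assumptions chosen only to match the 4 + 1 - 3 arithmetic.

```sas
/* Hypothetical stand-in for WORK.PRODUCT: names and values are
   assumptions, not the data from the original question. */
data work.product;
   input Prod_ID $ Price Sales Returns;
   datalines;
K12S 95.50 120 2
B132S 2.99 30 1
;
run;

data work.revenue;
   set work.product;            /* 4 variables are read into memory        */
   Revenue = Price * Sales;     /* a 5th variable is created               */
   drop Price Sales Returns;    /* 3 variables are flagged, not written    */
run;

/* The variable list should show only Prod_ID and Revenue. */
proc contents data=work.revenue;
run;
```

The PROC CONTENTS listing for WORK.REVENUE is where the "2" comes from: it describes the output data set, not the variables that were available while the step ran.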
Durlov
Obsidian | Level 7

So the following question led to my confusion:

 

Given what you know about how SAS processes the DROP and KEEP statements, would these two DATA steps create the same data set?

data work.subset1;
   set orion.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

data work.subset1;
   set orion.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;
a. Yes
b. No

 

Correct answer: a

Variables in the DROP statement are dropped during output, so they're available for calculations in the DATA step, even if they follow the statements that reference them.

 

If variables can still be used in the calculations, wouldn't that mean they still exist in the data set? Thanks much!
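
One way to test the quiz answer directly is to run both versions under different names and compare the results. This is only a sketch: orion.sales is not available here, so WORK.SALES below is a made-up stand-in, and the output names drop_first and drop_last are hypothetical.

```sas
/* Hypothetical stand-in for orion.sales: values are invented. */
data work.sales;
   input Employee_ID Salary Hire_Date :date9.;
   format Hire_Date date9.;
   datalines;
120102 108255 01JUN1993
120103 87975 01JAN1978
;
run;

/* DROP placed before the statements that use Salary */
data work.drop_first;
   set work.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

/* DROP placed after the statements that use Salary */
data work.drop_last;
   set work.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;

/* Reports any differences in variables or values between the two */
proc compare base=work.drop_first compare=work.drop_last;
run;
```

If the quiz answer holds, PROC COMPARE should report no unequal values, and neither data set should contain Salary even though both used it to compute Compensation.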

Astounding
PROC Star

Here's a simplified way of looking at the processing within a DATA step.

 

Variables get copied from the input data set into memory.

 

Calculations get performed on the values in memory.

 

Results get copied from memory to the output data set.

 

When a DROP statement appears, that affects the final step, copying from memory to the output data set.  It is not necessary to copy all variables to the output.  Even a variable that was used in calculations can be dropped from the output data set.

 

If you are going to research this a bit more on your own, "in memory" would refer to the PDV (Program Data Vector).
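
To see the difference between memory and output for yourself, a small sketch like the following may help. The data set WORK.STAFF and its values are invented for illustration; the PUT _ALL_ statement writes the current contents of the PDV to the log on each iteration.

```sas
/* Hypothetical input: names and values are made up. */
data work.staff;
   input Name $ Salary;
   datalines;
Ana 50000
Ben 42000
;
run;

data work.pay;
   set work.staff;
   drop Salary;                        /* affects only what is written out */
   Bonus = 500;
   Compensation = sum(Salary, Bonus);  /* Salary is still in the PDV       */
   put _all_;                          /* log lists Salary with its value  */
run;

/* Salary should not appear as a column in the output data set. */
proc print data=work.pay;
run;
```

The log lines from PUT _ALL_ include Salary, while the PROC PRINT output does not, which matches the three-step picture above.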

 

Good luck.

Durlov
Obsidian | Level 7
So just to be clear, after the copying from memory to the output data set occurs, the variables are permanently dropped from the data set, right? For example, you can't drop a variable in a previous DATA step and then use that same variable for calculations in the next DATA step (assuming you're using the same data set for both DATA steps). Is that correct?
Reeza
Super User
That's correct 🙂
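
A short sketch of what this looks like in practice, using an invented WORK.STAFF data set and hypothetical step names. One detail worth knowing: as far as I know, SAS does not stop with an error when the second step references the dropped variable. It creates a new, empty variable with that name and writes a note to the log.

```sas
/* Hypothetical input: names and values are made up. */
data work.staff;
   input Name $ Salary;
   datalines;
Ana 50000
Ben 42000
;
run;

data work.step1;
   set work.staff;
   drop Salary;            /* Salary is never written to WORK.STEP1 */
run;

data work.step2;
   set work.step1;         /* Salary is not in the input any more   */
   Doubled = Salary * 2;   /* Salary is a new variable here, with   */
                           /* missing values, so Doubled is missing */
run;

proc print data=work.step2;
run;
```

The log for the second step should contain a note that Salary is uninitialized, and both Salary and Doubled come out missing. The original values are gone from WORK.STEP1, so the calculation cannot use them.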
Astounding
PROC Star

Just to get nit-picky about a small detail, when a variable is dropped it never appears in the output data set.  "DROP" means when copying from memory to the output data set, don't bother to copy this variable.

Durlov
Obsidian | Level 7

Yeah, thanks much for your help. I think I get it now (for some reason I can't remove the Italics). The 2nd question I posted confused me into thinking that the variable actually stays on, but just isn't displayed in the output.

Reeza
Super User

You're making this too hard.

 

There's data available to be used while creating a data set, and then there's data that's contained in the final output data set. They are two different things. Otherwise, what's the point of a DROP statement if the variable is still there in the end?

Tanmay
Calcite | Level 5

The output data set will contain two variables:

1 (remaining source variable) + 1 (newly created variable) = 2

