Hello, can someone help me understand why the answer is 2?
Given the SAS data set WORK.PRODUCT
(See attached image).
I thought the answer would be 6 because, as far as I understand it, dropping a variable only removes it from the output and not from the actual data set. So previously Work.Products had 5 variables. Work.Revenue inherited these 5 variables and added one more (Revenue). So that makes 6. Why am I wrong?
So the following question led to my confusion:
Given what you know about how SAS processes the DROP and KEEP statements, would these two DATA steps create the same data set?
data work.subset1;
   set orion.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

data work.subset1;
   set orion.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;
a. Yes
b. No
Correct answer: a
Variables in the DROP statement are dropped during output, so they're available for calculations in the DATA step, even if they follow the statements that reference them.
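One way to check this claim is to run both versions and compare the results. The sketch below is only an illustration: work.sales is a made-up stand-in for orion.sales (two invented rows), and the output names work.drop_first and work.drop_last are hypothetical, chosen so that the second step does not overwrite the first.

```sas
/* Hypothetical stand-in for orion.sales so the example runs anywhere. */
data work.sales;
   input Employee_ID Salary Hire_Date :date9.;
   format Hire_Date date9.;
   datalines;
120102 108255 01JUN1993
120103 87975 01JAN1978
;
run;

/* DROP appears before the statements that use Salary. */
data work.drop_first;
   set work.sales;
   drop Salary;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
run;

/* DROP appears after the statements that use Salary. */
data work.drop_last;
   set work.sales;
   Bonus=500;
   Compensation=sum(Salary,Bonus);
   BonusMonth=month(Hire_Date);
   drop Salary;
run;

/* Compare variables and values of the two output data sets. */
proc compare base=work.drop_first compare=work.drop_last;
run;
```

If the two steps really are equivalent, PROC COMPARE should report no differences in either variables or values, and Compensation should be non-missing in both data sets even though Salary is absent from both.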
If variables can still be used in the calculations, wouldn't that mean they still exist in the data set? Thanks much!
Here's a simplified way of looking at the processing within a DATA step:
1. Variables get copied from the input data set into memory.
2. Calculations get performed on the values in memory.
3. Results get copied from memory to the output data set.
When a DROP statement appears, that affects the final step, copying from memory to the output data set. It is not necessary to copy all variables to the output. Even a variable that was used in calculations can be dropped from the output data set.
If you are going to research this a bit more on your own, "in memory" would refer to the PDV (Program Data Vector).
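A small sketch can make the memory-versus-output distinction visible. The data set names (work.staff, work.pay) and the values are invented for illustration. PUT _ALL_ writes the current contents of the PDV to the log, and PROC CONTENTS lists what actually reached the output data set.

```sas
/* Hypothetical input data set with two variables: Name and Salary. */
data work.staff;
   input Name $ Salary;
   datalines;
Ann 50000
Bob 42000
;
run;

data work.pay;
   set work.staff;
   drop Salary;                      /* takes effect only at output time */
   Compensation = sum(Salary, 500);  /* Salary is still available in the PDV */
   put _all_;                        /* log shows Name, Salary, Compensation, _ERROR_, _N_ */
run;

/* Lists the variables stored in work.pay: Name and Compensation only. */
proc contents data=work.pay;
run;
```

In the log, Salary appears with its value on every iteration, which shows it is in memory during the calculations. In the PROC CONTENTS listing, work.pay has two variables: the one kept from the input plus the one newly created.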
Good luck.
Just to get nit-picky about a small detail, when a variable is dropped it never appears in the output data set. "DROP" means when copying from memory to the output data set, don't bother to copy this variable.
Yeah, thanks much for your help. I think I get it now. The 2nd question I posted confused me into thinking that the variable actually stays in the data set, but just isn't displayed in the output.
You're making this too hard.
There's data available to be used while creating a data set, and then there's data that's contained in the final output data set. They are two different things. Otherwise, what's the point of a DROP statement if the variable is still there in the end?
The output data set will contain two variables:
1 + 1 (newly created variable) = 2