Drop and Keep Question

01-05-2016 11:12 AM

Hello, can someone help me understand why the answer is 2?

Given the SAS data set WORK.PRODUCT

(See attached image).

I thought the answer would be 6 because, as far as I understand it, dropping a variable only removes it from the output and not from the actual data set. So Work.Products previously had 5 variables; Work.Revenue inherited these 5 variables and added one more (Revenue), which makes 6. Why am I wrong?

All Replies

Posted in reply to Durlov

01-05-2016 11:25 AM

Dropping a variable removes it from the data set.

I'm not sure how you're differentiating the output from the final dataset, but in this case they're the same thing.

The correct answer is 2:

- Source data = 4 variables
- Calculate a new variable = 5 variables
- Drop 3 variables = 2 variables
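The count above can be checked with a small sketch. The original program is only in the attached image, so the data values and the variable names ProdId, Price, Sales and Returns below are assumptions made for illustration; only Work.Products, Work.Revenue and Revenue come from the question.

```sas
/* Hypothetical stand-in for the data set in the attached image: 4 variables */
data work.products;
    input ProdId $ Price Sales Returns;
    datalines;
A100 10 50 5
B200 20 30 2
;
run;

data work.revenue;
    set work.products;                    /* 4 variables are read into memory  */
    Revenue = Price * (Sales - Returns);  /* a 5th variable is created         */
    drop Price Sales Returns;             /* 3 variables are not written out   */
run;

/* The variable list should contain only ProdId and Revenue */
proc contents data=work.revenue;
run;
```

Under these assumptions, PROC CONTENTS on work.revenue should report 2 variables, matching 4 + 1 - 3 = 2.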

Posted in reply to Reeza

01-05-2016 12:16 PM

So the following question led to my confusion:

Given what you know about how SAS processes the DROP and KEEP statements, would these two DATA steps create the same data set?

    data work.subset1;
        set orion.sales;
        drop Salary;
        Bonus=500;
        Compensation=sum(Salary,Bonus);
        BonusMonth=month(Hire_Date);
    run;

    data work.subset1;
        set orion.sales;
        Bonus=500;
        Compensation=sum(Salary,Bonus);
        BonusMonth=month(Hire_Date);
        drop Salary;
    run;

a. Yes

b. No

Correct answer: a

Variables in the DROP statement are dropped during output, so they're available for calculations in the DATA step, regardless of whether the DROP statement comes before or after the statements that reference them.

If variables can still be used in the calculations, wouldn't that mean they still exist in the data set? Thanks much!
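One way to check the quiz answer empirically is to run both versions and compare the results. This is only a sketch: orion.sales is a course library that may not be available, so work.sales below is a hypothetical stand-in with made-up rows, and the second step is named work.subset2 here so the two outputs can be compared.

```sas
/* Hypothetical stand-in for orion.sales */
data work.sales;
    input Employee_ID Salary Hire_Date :date9.;
    format Hire_Date date9.;
    datalines;
101 30000 15JAN2010
102 45000 01JUL2012
;
run;

/* DROP placed before the statements that use Salary */
data work.subset1;
    set work.sales;
    drop Salary;
    Bonus = 500;
    Compensation = sum(Salary, Bonus);
    BonusMonth = month(Hire_Date);
run;

/* DROP placed after the statements that use Salary */
data work.subset2;
    set work.sales;
    Bonus = 500;
    Compensation = sum(Salary, Bonus);
    BonusMonth = month(Hire_Date);
    drop Salary;
run;

/* Reports any differences in variables or values between the two outputs */
proc compare base=work.subset1 compare=work.subset2;
run;
```

If the quiz explanation holds, PROC COMPARE should find no differences: Salary is used in both calculations and is absent from both output data sets.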

Posted in reply to Durlov

01-05-2016 12:49 PM

Here's a simplified way of looking at the processing within a DATA step.

Variables get copied from the input data set into memory.

Calculations get performed on the values in memory.

Results get copied from memory to the output data set.

When a DROP statement appears, that affects the final step, copying from memory to the output data set. It is not necessary to copy all variables to the output. Even a variable that was used in calculations can be dropped from the output data set.

If you are going to research this a bit more on your own, "in memory" would refer to the PDV (Program Data Vector).
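The three steps above can be watched in a small sketch. The data set work.staff and its values are hypothetical; the PUT statement is used only to write the in-memory values to the log on each iteration.

```sas
/* Hypothetical input data */
data work.staff;
    input Name $ Salary;
    datalines;
Ann 1000
Bob 2000
;
run;

data work.pay;
    set work.staff;                      /* step 1: input values are copied into memory (the PDV) */
    drop Salary;                         /* only marks Salary as "do not write to the output"     */
    Bonus = 500;
    Compensation = sum(Salary, Bonus);   /* step 2: Salary is still available for calculations    */
    put Name= Salary= Compensation=;     /* log shows Salary still has a value in memory          */
run;                                     /* step 3: Name, Bonus and Compensation are written out  */
```

The log lines from PUT should show a non-missing Salary on every iteration, while work.pay itself should contain no Salary column.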

Good luck.

Posted in reply to Astounding

01-05-2016 01:20 PM

So just to be clear, once the copying from memory to the output data set has occurred, the dropped variables are permanently gone from that data set, right? For example, you can't drop a variable in one DATA step and then use that same variable for calculations in the next DATA step (assuming the next step reads the data set the previous step created). Is that correct?

Solution

01-05-2016 01:29 PM

Posted in reply to Durlov

01-05-2016 01:23 PM

That's correct
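A minimal sketch of that situation, using a hypothetical work.staff data set: the second DATA step reads the output of the first, where Salary no longer exists.

```sas
/* Hypothetical input data */
data work.staff;
    input Name $ Salary;
    datalines;
Ann 1000
Bob 2000
;
run;

data work.step1;
    set work.staff;
    drop Salary;                       /* Salary is not written to work.step1 */
run;

data work.step2;
    set work.step1;                    /* work.step1 has no Salary variable */
    Compensation = sum(Salary, 500);   /* Salary here is a brand-new, missing variable */
run;
```

Note that SAS does not stop with an error in this case. It should create a new numeric variable named Salary with missing values and write a note to the log that the variable is uninitialized, so Compensation is computed from the bonus alone. The original salaries are only recoverable by reading work.staff again.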

Posted in reply to Durlov

01-05-2016 01:36 PM

Just to get nit-picky about a small detail, when a variable is dropped it never appears in the output data set. "DROP" means when copying from memory to the output data set, don't bother to copy this variable.

Posted in reply to Astounding

01-05-2016 02:06 PM

Yeah, thanks much for your help. *I think I get it now (for some reason I can't remove the Italics). The 2nd question I posted confused me into thinking that the variable actually stays on, but just isn't displayed in the output.*

Posted in reply to Durlov

01-05-2016 12:49 PM

You're making this too hard.

There's data available to be used while creating a data set, and then there's data that's contained in the final output data set. They are two different things. Otherwise, what's the point of a DROP statement if the variable is still there in the end?

Posted in reply to Reeza

01-06-2016 07:28 AM

The output data set will contain two variables:

1 (kept from the source data) + 1 (newly created variable) = 2