turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- creating a macro variable within a data step

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:31 PM - edited 03-22-2017 12:34 PM

suppsoe that I have the following data

company | date diff | price |

A | -2 | 2 |

A | -1 | 5 |

A | 0 | 3 |

A | 1 | 1 |

A | 2 | 7 |

What I would like to do is to find the "date diff" = 0, and take the price of that "date diff" and subtract it from all the other prices. So here I need to subtract 3 from all the other prices and get the following:

company | date diff | price | price diff |

A | -2 | 2 | -1 |

A | -1 | 5 | 2 |

A | 0 | 3 | 0 |

A | 1 | 1 | -2 |

A | 2 | 7 | 4 |

Of course I can by inspection find the price of the date diff = 0 and then subtract it from the other prices, but I would like to have a dynamic model that can be used form many companies (A, B, C ....). That is why I imagine that somehow the price = 3 can be put into a macro variable and then subtracted from the other prices to create price diff, but I might be completely wrong, so please let me know if there is another way.

Thanks!

Accepted Solutions

Solution

03-22-2017
01:17 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 01:08 PM

If your data set is already sorted by COMPANY, a single DATA step will do:

data want;

do until (last.company);

set have;

by company;

if date_diff=0 then max_price=price;

end;

do until (last.company);

set have;

by company;

price_diff = price - max_price;

output;

end;

drop max_price; /* optional, of course */

run;

The top loop finds the maximum PRICE value. The bottom loop reads the same observations, calculates, and outputs.

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:34 PM

Do you have multiple companies or 0's that you need to find or a single one?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:35 PM

many companies, each one having their own date diff = 0

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:38 PM

Then @LinusH solution is correct, a single data step to start may be easier IMO.

1. Calculate Diff - if 0 then output record to dataset (data step)

2. Merge back with main data

3. Next calculation required

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:36 PM

I would use SQL, first query to find the date different price. Then join on the original table and perform the subtract.

Data never sleeps

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:41 PM

That is actually what I did originally, but since my data is big I thought that maybe there is a simpler way, but if this is the best solution for the case I will just stick to it

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 01:11 PM

ilikesas wrote:

@ilikesas In the future, please note what you've tried in your initial question and include the code otherwise it's a waste of our time to suggest things that you already know.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 12:42 PM

The GROUP BY is not necessary for your sample data, but will account for new groups. This does depend on the full set of data being similar to what you provided.

```
data have;
input company $ date_diff price;
datalines;
A -2 2
A -1 5
A 0 3
A 1 1
A 2 7
;
proc sql;
CREATE TABLE want AS
SELECT *,
price - max((date_diff = 0) * price) AS price_diff
FROM have
GROUP BY company
ORDER BY company, date_diff;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 01:17 PM

thanks for the suggestion

`price - max((date_diff = 0) * price) AS price_diff`

becasue I thought of calculating the price_diff in several steps, but it can be done in one!

Solution

03-22-2017
01:17 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 01:08 PM

If your data set is already sorted by COMPANY, a single DATA step will do:

data want;

do until (last.company);

set have;

by company;

if date_diff=0 then max_price=price;

end;

do until (last.company);

set have;

by company;

price_diff = price - max_price;

output;

end;

drop max_price; /* optional, of course */

run;

The top loop finds the maximum PRICE value. The bottom loop reads the same observations, calculates, and outputs.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

03-22-2017 01:30 PM

If you only have one record per COMPANY where DATE_DIFF=0 then you can just merge the file back with itself.

```
data have;
input company $ date_diff price @@ ;
cards;
A -2 2 A -1 5 A 0 3 A 1 1 A 2 7
;;;;
data want ;
merge have have(rename=(date_diff=x price=base_price) where=(x=0) );
by company ;
price_diff = price - base_price ;
drop x;
run;
```