turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- the rolling prior five-year average computation

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-07-2015 08:48 PM

Hi,

The following is the datset I have.

gvkey fyear rds

0001 1980 0.05

0001 1981 0.02

0001 1982 0.03

0001 1983 0.04

0001 1984 0.04

0001 1985 0.05

0001 1986 0.06

I would like to compute the rolling prior five-year average for variable 'rds'. In other words, in 1986, average rds of prior five year is (rds in1985 + rds in 1984+rds in 1983+rds in 1982+rds in 1981) should be computed, and in 1985, average rds of prior five year is (rds in 1984 + rds in 1983 + rds in 1982 + rds in 1981 + rds in 1980) should be computed, and so on. And I also want to require at least three non-missing observations per gvkey. In this example, in 1983, average of rds of prior three year is (rds in 1982 + rds in 1981 + rds in 1980) can be computed. (not 1982, 1981). Any advice will be highly appreciated.

proc sql: create table want as

select *, select ave(rds) from have where fyear between intnx ('fyear',-5) and ('fyear',-1) from have as rds5

Accepted Solutions

Solution

12-09-2015
09:58 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

12-09-2015 09:58 PM

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-07-2015 09:14 PM

Do you have SAS/ETS? If so, Proc Expand is a good solution. If not, you can use a temporary array solution:

https://communities.sas.com/message/244232

https://communities.sas.com/message/244232

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

12-07-2015 10:33 PM

Thank you for your prompt reply

According to my search, PROC EXPAND 'movave' option computes the rolling five-year mean using current year value and values of prior years (lag4, lag3,lag2,lag1 values). My example needs to compute the folling prior five-year mean using values of lag5, lag4, lag3, lag2, and lag1. It wll be appreicated if you advise me more in detail about array statement.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-07-2015 10:56 PM - edited 12-08-2015 04:32 AM

```
data want;
array p{0:4} _temporary_;
set have;
by object;
if first.object then call missing(of p{*});
if n(of p{*})>=3 then average = mean(of p{*});
p{mod(_n_,5)} = price;
run;
```

EDIT: Modified to account for 3+ years of data present.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-07-2015 11:02 PM

Can be done efficiently with arrays but you have to worry about missing years. Not a problem with SQL:

```
data have;
input gvkey fyear rds;
datalines;
0001 1980 0.05
0001 1981 0.02
0001 1982 0.03
0001 1983 0.04
0001 1984 0.04
0001 1985 0.05
0001 1986 0.06
;
proc sql;
create table want as
select
a.gvkey,
a.fyear,
a.rds,
case
when count(b.rds) >= 3 then mean(b.rds)
else .
end as rds5y
from
have as a left join
have as b
on a.gvkey=b.gvkey and
b.fyear between a.fyear-5 and a.fyear-1
group by a.gvkey, a.fyear, a.rds;
select * from want;
quit;
```

PG

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

12-09-2015 05:50 PM

Thank you so much for your help and time!

Your program worked successfully. Can I ask one more question?

I have six dataset with over 200,000 observations like the following and want to merge these six datasets efficiently.

dataset 1:gvkey fyear rds

dataset 2:gvkey fyear emp

dataset 3:gvkey fyear emps

dataset 4:gvkey fyear ddd

dataset 5:gvkey fyear eee

dataset 6:gvkey fyear aaa

wanted integrated dataset: gvkey fyear rds emp emps ddd eee aaa

Using 'set' and 'merge' statements don't work well.

Thank you again for your help

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-09-2015 06:27 PM

MERGE should work well for joining your datasets once they are sorted. What do you mean by "don't work well". What code did you try? What problem did you encounter?

data all;

merge dataset1 dataset2 dataset3 dataset4 dataset5 dataset6;

by gvkey fyear;

run;

PG

Solution

12-09-2015
09:58 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to PGStats

12-09-2015 09:58 PM

Thank you very much

It works well.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-09-2015 10:23 PM

Generally you mark the correct answer as correct, not your response....

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to kimx0961

12-09-2015 09:34 PM

You should mark this question answered and post a new question, which people would be happy to help with