turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Data Management
- /
- Forum
- /
- how to calculate a rolling average

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-17-2014 09:22 PM

Hi

I'm hoping someone can help me solve this. I've been looking up proc expand as a possible solution, but haven't had any luck.

I'm trying to create a moving average, but my data has some gaps and I want the moving average to reset each time I get to a new ID. Each ID should have 10 weeks of data, but sometimes, the information isn't complete (e.g missing weeks of data)... so my data set looks something like this:

ID | Week | Count |

D1 | 1 | 13 |

D1 | 2 | 11 |

D1 | 3 | 9 |

D1 | 4 | 8 |

D1 | 5 | 10 |

D1 | 6 | 9 |

D1 | 7 | 5 |

D1 | 8 | 7 |

C1 | 2 | 1 |

C1 | 4 | 4 |

C1 | 5 | 5 |

C1 | 7 | 2 |

C1 | 8 | 3 |

C1 | 9 | 5 |

C1 | 10 | 2 |

E1 | 3 | 7 |

E1 | 5 | 9 |

E1 | 6 | 7 |

E1 | 8 | 5 |

E1 | 9 | 7 |

The resulting data set I'm trying to get is this. Where all IDs have 10 weeks, and when there is a week missing, the Count value is set to zero, and the average is based on the the 4 previous values (e.g. average for week 4 for ID D1 is (13+11+9+8)/4 . But once I get to a new ID, I'd like the average to re-set.

ID | Week | Count | Average |

D1 | 1 | 13 | . |

D1 | 2 | 11 | . |

D1 | 3 | 9 | . |

D1 | 4 | 8 | 10 |

D1 | 5 | 10 | 10 |

D1 | 6 | 9 | 9 |

D1 | 7 | 5 | 8 |

D1 | 8 | 7 | 8 |

D1 | 9 | 0 | 5 |

D1 | 10 | 0 | 3 |

C1 | 1 | 0 | 2 |

C1 | 2 | 1 | . |

C1 | 3 | 0 | . |

C1 | 4 | 4 | . |

C1 | 5 | 5 | 3 |

C1 | 6 | 0 | 2 |

C1 | 7 | 2 | 3 |

C1 | 8 | 3 | 3 |

C1 | 9 | 5 | 3 |

C1 | 10 | 2 | 3 |

E1 | 1 | 0 | . |

E1 | 2 | 0 | . |

E1 | 3 | 7 | . |

E1 | 4 | 0 | 2 |

E1 | 5 | 9 | 4 |

E1 | 6 | 7 | 6 |

E1 | 7 | 0 | 4 |

E1 | 8 | 5 | 5 |

E1 | 9 | 7 | 5 |

E1 | 10 | 0 | 3 |

Thanks in advance

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to brenda023

12-17-2014 10:33 PM

First you'll need to create a time series with no missing values and then you can use either a data step with the lag function if it's only four time periods or you could use proc expand. Both methods are illustrated below:

**data** have;

input ID $ Week Count;

cards;

D1 1 13

D1 2 11

D1 3 9

D1 4 8

D1 5 10

D1 6 9

D1 7 5

D1 8 7

C1 2 1

C1 4 4

C1 5 5

C1 7 2

C1 8 3

C1 9 5

C1 10 2

E1 3 7

E1 5 9

E1 6 7

E1 8 5

E1 9 7

;

**run**;

**data** weeks;

do week=**1** to **10**;

output;

end;

**run**;

**proc** **sql**;

create table step1 as

select t1.id, t1.week, coalesce(t2.count, **0**) as count

from (

select distinct a.id, b.week

from have as a

cross join weeks as b) as t1

left join have as t2

on t1.id=t2.id

and t1.week=t2.week

order by t1.id, t1.week;

**quit**;

**data** want1;

set step1;

by id;

if first.id then n_count=**1**;

else n_count+**1**;

lag1=lag1(count);

lag2=lag2(count);

lag3=lag3(count);

if n_count>=**4** then rolling_avg=(count+lag1+lag2+lag3)/**4**;

else rolling_avg=**.**;

drop lag:;

**run**;

**proc** **expand** data=step1 out=want2;

by id;

id week;

convert count=rolling_avg4 / transformout=(movave **4** trimleft **3**);

**run**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to brenda023

12-17-2014 11:35 PM

data base;

do id = 'D1','C1','E1';

do week=1 to 10;

output;

end;

end;

run;

proc sql;

create table want as

select b.id,b.week,coalesce(a.count,0) as count

from base b left join have a

on b.id=a.id

and b.week=a.week;

quit;

data final;

set want;

by id;

four_week_moving_avg=(count+lag(count)+lag2(count)+lag3(count))/4;

format four_week_moving_avg 3.0;

if week < 4 then four_week_moving_avg=.;

run;

proc print data=final;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to brenda023

12-18-2014 07:28 AM

data have; input ID $ Week Count; cards; D1 1 13 D1 2 11 D1 3 9 D1 4 8 D1 5 10 D1 6 9 D1 7 5 D1 8 7 C1 2 1 C1 4 4 C1 5 5 C1 7 2 C1 8 3 C1 9 5 C1 10 2 E1 3 7 E1 5 9 E1 6 7 E1 8 5 E1 9 7 ; run; proc sql; create table want as select x.*,case when week lt 4 then . else (select sum(count)/4 from have where id=x.id and week between x.week-3 and x.week) end as average from ( select t1.id, t1.week, coalesce(t2.count, 0) as count from ( select * from (select distinct id from have),(select distinct week from have) ) as t1 left join have as t2 on t1.id=t2.id and t1.week=t2.week ) as x; quit;

Xia Keshan