SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

how to calculate a rolling average

Reply
Occasional Contributor
Posts: 5

how to calculate a rolling average

Hi

I'm hoping someone can help me solve this.  I've been looking up proc expand as a possible solution, but haven't had any luck.

I'm trying to create a moving average, but my data has some gaps and I want the moving average to reset each time I get to a new ID.  Each ID should have 10 weeks of data, but sometimes, the information isn't complete (e.g missing weeks of data)... so my data set looks something like this:

IDWeek Count
D1113
D1211
D139
D148
D1510
D169
D175
D187
C121
C144
C155
C172
C183
C195
C1102
E137
E159
E167
E185
E197

The resulting data set I'm trying to get is this.  Where all IDs have 10 weeks, and when there is a week missing, the Count value is set to zero, and the average is based on the the 4 previous values (e.g. average for week 4 for ID D1 is (13+11+9+8)/4 .  But once I get to a new ID, I'd like the average to re-set.

IDWeek CountAverage
D1113.
D1211.
D139.
D14810
D151010
D1699
D1758
D1878
D1905
D11003
C1102
C121.
C130.
C144.
C1553
C1602
C1723
C1833
C1953
C11023
E110.
E120.
E137.
E1402
E1594
E1676
E1704
E1855
E1975
E11003

Thanks in advance

Super User
Posts: 19,822

Re: how to calculate a rolling average

Posted in reply to brenda023

First you'll need to create a time series with no missing values and then you can use either a data step with the lag function if it's only four time periods or you could use proc expand. Both methods are illustrated below:

data have;

input ID $ Week Count;

cards;

D1 1 13

D1 2 11

D1 3 9

D1 4 8

D1 5 10

D1 6 9

D1 7 5

D1 8 7

C1 2 1

C1 4 4

C1 5 5

C1 7 2

C1 8 3

C1 9 5

C1 10 2

E1 3 7

E1 5 9

E1 6 7

E1 8 5

E1 9 7

;

run;

data weeks;

do week=1 to 10;

output;

end;

run;

proc sql;

create table step1 as

select t1.id, t1.week, coalesce(t2.count, 0) as count

from (

select distinct a.id, b.week

from have as a

cross join weeks as b) as t1

left join have as t2

on t1.id=t2.id

and t1.week=t2.week

order by t1.id, t1.week;

quit;

data want1;

  set step1;

  by id;

  if first.id then n_count=1;

  else n_count+1;

  lag1=lag1(count);

  lag2=lag2(count);

  lag3=lag3(count);

  if n_count>=4 then rolling_avg=(count+lag1+lag2+lag3)/4;

  else rolling_avg=.;

  drop lag:;

run;

proc expand data=step1 out=want2;

by id;

id week;

convert count=rolling_avg4 / transformout=(movave 4 trimleft 3);

run;

Trusted Advisor
Posts: 1,228

Re: how to calculate a rolling average

Posted in reply to brenda023

data base;

do id = 'D1','C1','E1';

do week=1 to 10;

output;

end;

end;

run;

proc sql;

create table want as

select b.id,b.week,coalesce(a.count,0) as count

from base b left join have a

on b.id=a.id

and b.week=a.week;

quit;

data final;

set want;

by id;

four_week_moving_avg=(count+lag(count)+lag2(count)+lag3(count))/4;

format four_week_moving_avg 3.0;

if week < 4 then four_week_moving_avg=.;

run;

proc print data=final;

run;

Super User
Posts: 10,035

Re: how to calculate a rolling average

Posted in reply to brenda023
data have;
input ID $ Week Count;
cards;
D1 1 13
D1 2 11
D1 3 9
D1 4 8
D1 5 10
D1 6 9
D1 7 5
D1 8 7
C1 2 1
C1 4 4
C1 5 5
C1 7 2
C1 8 3
C1 9 5
C1 10 2
E1 3 7
E1 5 9
E1 6 7
E1 8 5
E1 9 7
;
run;

proc sql;
create table want as
select x.*,case when week lt 4 then .
              else (select sum(count)/4 from have where id=x.id and week between x.week-3 and x.week)
           end as average
from (
     select t1.id, t1.week, coalesce(t2.count, 0) as count
      from ( select * from (select distinct id from have),(select distinct week from have) ) as t1
       left join have as t2 on t1.id=t2.id and t1.week=t2.week 
     ) as x;
quit;

Xia Keshan

Ask a Question
Discussion stats
  • 3 replies
  • 864 views
  • 0 likes
  • 4 in conversation