BookmarkSubscribeRSS Feed
brenda023
Calcite | Level 5

Hi

I'm hoping someone can help me solve this.  I've been looking up proc expand as a possible solution, but haven't had any luck.

I'm trying to create a moving average, but my data has some gaps and I want the moving average to reset each time I get to a new ID.  Each ID should have 10 weeks of data, but sometimes, the information isn't complete (e.g missing weeks of data)... so my data set looks something like this:

IDWeek Count
D1113
D1211
D139
D148
D1510
D169
D175
D187
C121
C144
C155
C172
C183
C195
C1102
E137
E159
E167
E185
E197

The resulting data set I'm trying to get is this.  Where all IDs have 10 weeks, and when there is a week missing, the Count value is set to zero, and the average is based on the the 4 previous values (e.g. average for week 4 for ID D1 is (13+11+9+8)/4 .  But once I get to a new ID, I'd like the average to re-set.

IDWeek CountAverage
D1113.
D1211.
D139.
D14810
D151010
D1699
D1758
D1878
D1905
D11003
C1102
C121.
C130.
C144.
C1553
C1602
C1723
C1833
C1953
C11023
E110.
E120.
E137.
E1402
E1594
E1676
E1704
E1855
E1975
E11003

Thanks in advance

3 REPLIES 3
Reeza
Super User

First you'll need to create a time series with no missing values and then you can use either a data step with the lag function if it's only four time periods or you could use proc expand. Both methods are illustrated below:

data have;

input ID $ Week Count;

cards;

D1 1 13

D1 2 11

D1 3 9

D1 4 8

D1 5 10

D1 6 9

D1 7 5

D1 8 7

C1 2 1

C1 4 4

C1 5 5

C1 7 2

C1 8 3

C1 9 5

C1 10 2

E1 3 7

E1 5 9

E1 6 7

E1 8 5

E1 9 7

;

run;

data weeks;

do week=1 to 10;

output;

end;

run;

proc sql;

create table step1 as

select t1.id, t1.week, coalesce(t2.count, 0) as count

from (

select distinct a.id, b.week

from have as a

cross join weeks as b) as t1

left join have as t2

on t1.id=t2.id

and t1.week=t2.week

order by t1.id, t1.week;

quit;

data want1;

  set step1;

  by id;

  if first.id then n_count=1;

  else n_count+1;

  lag1=lag1(count);

  lag2=lag2(count);

  lag3=lag3(count);

  if n_count>=4 then rolling_avg=(count+lag1+lag2+lag3)/4;

  else rolling_avg=.;

  drop lag:;

run;

proc expand data=step1 out=want2;

by id;

id week;

convert count=rolling_avg4 / transformout=(movave 4 trimleft 3);

run;

stat_sas
Ammonite | Level 13

data base;

do id = 'D1','C1','E1';

do week=1 to 10;

output;

end;

end;

run;

proc sql;

create table want as

select b.id,b.week,coalesce(a.count,0) as count

from base b left join have a

on b.id=a.id

and b.week=a.week;

quit;

data final;

set want;

by id;

four_week_moving_avg=(count+lag(count)+lag2(count)+lag3(count))/4;

format four_week_moving_avg 3.0;

if week < 4 then four_week_moving_avg=.;

run;

proc print data=final;

run;

Ksharp
Super User
data have;
input ID $ Week Count;
cards;
D1 1 13
D1 2 11
D1 3 9
D1 4 8
D1 5 10
D1 6 9
D1 7 5
D1 8 7
C1 2 1
C1 4 4
C1 5 5
C1 7 2
C1 8 3
C1 9 5
C1 10 2
E1 3 7
E1 5 9
E1 6 7
E1 8 5
E1 9 7
;
run;

proc sql;
create table want as
select x.*,case when week lt 4 then .
              else (select sum(count)/4 from have where id=x.id and week between x.week-3 and x.week)
           end as average
from (
     select t1.id, t1.week, coalesce(t2.count, 0) as count
      from ( select * from (select distinct id from have),(select distinct week from have) ) as t1
       left join have as t2 on t1.id=t2.id and t1.week=t2.week 
     ) as x;
quit;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 3085 views
  • 0 likes
  • 4 in conversation