I have the following data, where some of the quantities are missing. I want to replace them by a rolling 3 month average of previous 3 months. I've tried to explain what I'm trying to achieve here-
ID | Month | Qty | Rolling average | Rolling average value |
1 | Jan | 2 | - | - |
1 | Feb | 3 | 2 | 2 |
1 | Mar | 2 | average(3,2) | 2.5 |
1 | Apr | 6 | average(2,3,2) | 2.333333333 |
1 | May | 4 | average(6,2,3) | 3.666666667 |
1 | Jun | 5 | AVERAGE(4,6,2) | 4 |
1 | Jul | . | average(5,4,6) | 5 |
1 | Aug | . | average(5,5,4) | 4.666666667 |
1 | Sep | . | average(4.666667,5,5) | 4.888888889 |
1 | Oct | . | average(4.8888889,4.666667,5) | 4.851851852 |
1 | Nov | . | average(4.8518519, 4.888889, 4.666667) | 4.802469136 |
1 | Dec | . | average(4.8024691, 4.8518519, 4.8888889) | 4.847736626 |
2 | Jan | 5 | ||
2 | Feb | 1 | ||
2 | Mar | 4 | ||
2 | Apr | 3 | ||
2 | May | 4 | ||
2 | Jun | 6 | ||
2 | Jul | . | ||
2 | Aug | . | ||
2 | Sep | . | ||
2 | Oct | . | ||
2 | Nov | . | ||
2 | Dec | . |
Can anyone please help on how to do this in SAS?
Try this and replace as you like.
data have;
input ID Month $ Qty;
datalines;
1 Jan 2
1 Feb 3
1 Mar 2
1 Apr 6
1 May 4
1 Jun 5
1 Jul .
1 Aug .
1 Sep .
1 Oct .
1 Nov .
1 Dec .
2 Jan 5
2 Feb 1
2 Mar 4
2 Apr 3
2 May 4
2 Jun 6
2 Jul .
2 Aug .
2 Sep .
2 Oct .
2 Nov .
2 Dec .
;
data want;
array lag[0:2] _temporary_;
call missing(of lag[*]);
do _N_ = 1 by 1 until (last.id);
set have;
by id;
output;
lag[mod(_N_, 3)] = coalesce(Qty, avg);
avg = mean(of lag[*]);
end;
run;
Result:
ID Month Qty avg 1 Jan 2 . 1 Feb 3 2 1 Mar 2 2.5 1 Apr 6 2.3333333333 1 May 4 3.6666666667 1 Jun 5 4 1 Jul . 5 1 Aug . 4.6666666667 1 Sep . 4.8888888889 1 Oct . 4.8518518519 1 Nov . 4.8024691358 1 Dec . 4.8477366255 2 Jan 5 . 2 Feb 1 5 2 Mar 4 3 2 Apr 3 3.3333333333 2 May 4 2.6666666667 2 Jun 6 3.6666666667 2 Jul . 4.3333333333 2 Aug . 4.7777777778 2 Sep . 5.037037037 2 Oct . 4.7160493827 2 Nov . 4.8436213992 2 Dec . 4.865569273
Try this and replace as you like.
data have;
input ID Month $ Qty;
datalines;
1 Jan 2
1 Feb 3
1 Mar 2
1 Apr 6
1 May 4
1 Jun 5
1 Jul .
1 Aug .
1 Sep .
1 Oct .
1 Nov .
1 Dec .
2 Jan 5
2 Feb 1
2 Mar 4
2 Apr 3
2 May 4
2 Jun 6
2 Jul .
2 Aug .
2 Sep .
2 Oct .
2 Nov .
2 Dec .
;
data want;
array lag[0:2] _temporary_;
call missing(of lag[*]);
do _N_ = 1 by 1 until (last.id);
set have;
by id;
output;
lag[mod(_N_, 3)] = coalesce(Qty, avg);
avg = mean(of lag[*]);
end;
run;
Result:
ID Month Qty avg 1 Jan 2 . 1 Feb 3 2 1 Mar 2 2.5 1 Apr 6 2.3333333333 1 May 4 3.6666666667 1 Jun 5 4 1 Jul . 5 1 Aug . 4.6666666667 1 Sep . 4.8888888889 1 Oct . 4.8518518519 1 Nov . 4.8024691358 1 Dec . 4.8477366255 2 Jan 5 . 2 Feb 1 5 2 Mar 4 3 2 Apr 3 3.3333333333 2 May 4 2.6666666667 2 Jun 6 3.6666666667 2 Jul . 4.3333333333 2 Aug . 4.7777777778 2 Sep . 5.037037037 2 Oct . 4.7160493827 2 Nov . 4.8436213992 2 Dec . 4.865569273
Depending on your real data this discussion might provide all the answers you need.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.