Hi Guys,
I need your help to suggest the code to create the 2 month moving average for proportion and value from 01/04/2019 to 01/08/2019.
Tried proc expand but it only computes the moving average based on the observed data (01-03, 2019, in this case) not the forecast data. Any simple clean code can be used to produce the values in the blank cells?
date Proportion Value
01/01/2019 0.1 3
01/02/2019 0.2 4
01/03/2019 0.3 7
01/04/2019
01/05/2019
01/06/2019
01/07/2019
01/08/2019
You don't need proc expand for something like this. Instead just provide one record per month, including records for the extended months, but with proportion for the last 3 months taking a missing value:
data have;
input date mmddyy10. proportion ;
format date date9. ;
datalines;
1/4/2019 0.1
1/5/2019 0.2
1/6/2019 0.3
1/7/2019 0.25
1/8/2019 0.5
1/9/2019
1/10/2019
1/11/2019
1/12/2019
run;
data want;
set have;
movavg=mean(coalesce(lag1(proportion),lag1(movavg))
,coalesce(lag2(proportion),lag2(movavg))
,coalesce(lag3(proportion),lag3(movavg))
);
run;
The movavg variables is the mean of 3 values, from the 3 preceding months. If you never had a missing value for proportion, you could do with
movavg=mean(lag1(proportion),lag2(proportion),lag3(proportion));
But the trailng months have no available lagged value for proportion. In those cases, you have told us you want the lagged value of the calculated moving average. That's why I used COALESCE(lagx(proportion),lagx(movavg)), which takes the Xth lag of proportion, except if it is missing, in which case it substitues the Xth lag of movavg.
@shasha11111 wrote:
Hi Guys,
I need your help to suggest the code to create the observations till 01Dec2024, with the 3 moving average is calculated for the dataset as below:
date Proportion
01/01/2019 0.1
01/02/2019 0.2
01/03/2019 0.3
? 3m moving average
?
?
It's not clear to me what goes on row 4 in column 2 where you have written 3m moving average. What goes in row 5 column 2? Please explain clearly what calculation goes there.
But there is no data after the third row.
The data after the third row is what I want to compute using the moving average. Have not figured out how to do it in SAS. Tried both proc expand a proc arima, does not work.
@shasha11111 wrote:
The data after the third row is what I want to compute using the moving average.
You want to compute data after the third row, using the moving average?? I still don't grasp the concept.
Is this what you want?
date proportion moving_average 1/1/2019 0.1 1/2/2019 0.2 1/3/2019 0.3 0.2 1/4/2019 0.25 0.25 1/5/2019 0.15 0.2333
where there is actual data under proportion at each date, and the moving average is in a separate column? The way you presented the data originally, and your words quoted above, indicate the moving average is in the same column as proportion, and no new data in column 2, only computed data; which doesn't make any sense to me.
Hi Paige,
Sorry for the confusion. Still new in posting questions here. Your answer is kind of what I need, but I also need to use the forecasted moving average data for the future month forecast, let me show your an example: for proportion data, I only have actuals till 1/8/2019, then I want to have the three month moving average for Sep'19 to Dec'19, using forecasted data as an input too.
date proportion moving_average 1/4/2019 0.1 1/5/2019 0.2 1/6/2019 0.3 1/7/2019 0.25 0.2 (average of 0.1, 0.2, 0.3) 1/8/2019 0.5 0.75 (average of 0.2, 0.3, 0.25)
1/9/2019 0.35 (average of 0.3, 0.25, 0.5)
1/10/2019 0.37 (average of 0.25, 0.5, 0.35(forecasted))
1/11/2019 0.41 (average of 0.5, 0.35(forecasted), 0.37(forecasted))
1/12/2019 0.38 (average of 0.35, 0.37, 0.41)
I tried proc expand, which gives exactly the result you offered. I also tried proc arima to compute the moving average. The interesting part by using proc arima is that, for the computed moving average, there will be a slightly difference comparing the result that I manually computed in excel (in 10e-4 level). Since this ratio is needed to multiply a big number (10 billion level), a slight difference will produce a big number after that.
I am wondering if any other simple way we can do the moving average with the forecasted data?
So you only have 5 data points, and then after that you use the moving average as a data point? It doesn't make any sense to do this.
You don't need proc expand for something like this. Instead just provide one record per month, including records for the extended months, but with proportion for the last 3 months taking a missing value:
data have;
input date mmddyy10. proportion ;
format date date9. ;
datalines;
1/4/2019 0.1
1/5/2019 0.2
1/6/2019 0.3
1/7/2019 0.25
1/8/2019 0.5
1/9/2019
1/10/2019
1/11/2019
1/12/2019
run;
data want;
set have;
movavg=mean(coalesce(lag1(proportion),lag1(movavg))
,coalesce(lag2(proportion),lag2(movavg))
,coalesce(lag3(proportion),lag3(movavg))
);
run;
The movavg variables is the mean of 3 values, from the 3 preceding months. If you never had a missing value for proportion, you could do with
movavg=mean(lag1(proportion),lag2(proportion),lag3(proportion));
But the trailng months have no available lagged value for proportion. In those cases, you have told us you want the lagged value of the calculated moving average. That's why I used COALESCE(lagx(proportion),lagx(movavg)), which takes the Xth lag of proportion, except if it is missing, in which case it substitues the Xth lag of movavg.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.