Hello all,
I want to run out-of-sample forecasts with rolling regressions for N countries from September 2008 to August 2017.
********************************************************* Data Set *****************************************
* My dataset is a monthly panel covering 10 countries.
* Sample period: December 1996 to August 2017.
Here is an example to show my data set.
Date Country Y X Number Observation
12/31/1996 Australia 1 1
01/31/1997 Australia 1 2
......
.....
.....
08/31/2017 Australia 1 250
12/31/1996 Canada 2 1
........
.......
08/31/2017 Canada 2 250
......
.....
......
12/31/1996 UK 10 1
.......
......
......
08/31/2017 UK 10 250
****************** Modified Data set ********************************************
I modified the above data set by creating a new date variable called 'Rankdate' that shows the ending date in each rolling regression, as follows:
Rankdate Date Country Y X Number Observations
August 2008 12/31/1996 Australia 1 1
August 2008 01/31/1997 Australia 1 2
......
August 2008 08/31/2008 Australia 1
September 2008 01/31/1997 Australia 1
.....
September 2008 09/30/2008 Australia 1
...................
.....................
August 2017 12/31/2005 Australia 1
............
August 2017 08/31/2017 Australia 1
August 2008 12/31/1996 Canada 2
........
.......
......
.....
......
August 2017 08/31/2017 UK 10
************************************ Goal: Run Recursive Regression ************************************
* I want to run several recursive regressions for EACH country (i.e., keep the starting date, December 1996, fixed
and add one observation to the end of the sample with every run of the regression).
More specifically, I run the following SAS code:
PROC UCM DATA=ma.developingrank(where=(country='CZ'));
   BY rankdate;
   ID DATE INTERVAL=MONTH;
   MODEL RETURN = MA;
   IRREGULAR;
   level var=0;
   /* The default is BACK=0, which means that the forecast starts at the end of the available data */
   estimate back=1 outest=ma.OOSRollEst1;
   /* This reports the one-step-ahead out-of-sample forecast */
   FORECAST back=1 LEAD=1 outfor=ma.OOSRollRes1 plot=forecasts;
RUN;
My problem, however, is that the above code gives me a forecast for each rolling window, and I don't know which model to report in my paper, or how.
Thanks for your help in advance.
Your question has many parts. I am going to answer the part about the computation of rolling forecasts. What to do with these rolling forecasts is up to you.
Currently there is no option in the UCM procedure to produce rolling forecasts. Until such an option becomes available, one must resort to repeated calls to UCM (quite inefficient and tedious). I am going to illustrate one such way.
First some notation. Suppose we have N measurements on a response variable y: y_1, y_2, ..., y_N. For h >= 1, let F(t,h) denote the h-step-ahead forecast of y_(t+h) using data y_1, y_2, ..., y_t. When (t+h) <= N, let E(t,h) = y_(t+h) - F(t,h) denote the (in-sample) h-step-ahead residual.
By default, the UCM procedure provides one-step-ahead (h=1) forecasts and residuals, F(t,1) and E(t,1), for many time instances t within the historical period. However, the in-sample multi-step-ahead forecasts and residuals (F(t,h) and E(t,h) for h > 1) are NOT provided for different time instances.
Computing rolling forecasts (and residuals) means computing F(t,h) and E(t,h) for various t and h combinations where (t+h) <= N. Unfortunately, at the moment there is no simple option to output all possible rolling forecasts in a single UCM call. You can, however, make repeated UCM calls with different holdout periods (the BACK= option in the FORECAST statement) to get the necessary rolling forecasts. When you use BACK=k, you get F(N-k,h) and E(N-k,h) for h = 1, 2, ..., k. This is not a very efficient way to get the rolling forecasts, but at the moment it is the simplest way I can think of.
The "roll" macro given below provides an illustration. It makes repeated calls to UCM to create a large data set, finalFor, that contains the rolling forecasts. finalFor must be processed further to extract them: the last k values of the for_k column in finalFor correspond to F(N-k,h) for h = 1, 2, ..., k. You can then rearrange these numbers; for example, collect all 3-step-ahead forecasts in a new data set. It does get tedious, but the necessary numbers are in the output table finalFor.
/*------- illustration ---------*/
/* Step 1: estimate the model parameters on the full series */
proc ucm data=sashelp.air;
   id date interval=month;
   model air;
   irregular;
   level;
   slope variance=0 noest;
run;

%let start = 5;   /* smallest holdout length (BACK=) */
%let end   = 6;   /* largest holdout length (BACK=) */

%macro roll;
   %do k = &start %to &end;   /* different values of the BACK= option */
      /* Step 2: rerun with the variances fixed at the estimates from the earlier call */
      proc ucm data=sashelp.air;   /* add NOPRINT here to suppress printed output */
         id date interval=month;
         model air;
         irregular variance=0.00027444 noest;
         level variance=1139.35114 noest;
         slope variance=0 noest;
         forecast back=&k lead=&k outfor=for;
      run;

      /* tag this run's forecast, standard error, and residual with k */
      data rfor;
         set for(keep=date air forecast residual std);
         for_&k      = forecast;
         std_&k      = std;
         residual_&k = residual;
      run;

      /* the first pass creates finalFor; later passes add columns to it */
      %if &k = &start %then %do;
         data finalFor;
            set rfor;
         run;
      %end;
      %else %do;
         data finalFor;
            merge rfor finalFor;
         run;
      %end;
   %end;
%mend roll;

%roll;
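The rearranging step described above (collecting all h-step-ahead forecasts in a new data set) could be sketched roughly as follows. This is only an illustration, not a built-in UCM feature: the data set name hstep, the macro variable h, and the variables origin, forecast_h, and residual_h are hypothetical names chosen here. It assumes finalFor has one row per observation in date order, and that the last k rows of for_k hold F(N-k,h) for h = 1, ..., k, so the h-step-ahead forecast from the run with BACK=k sits in row N-k+h.

```sas
%let h     = 3;   /* horizon to extract (assumed; should not exceed &start) */
%let start = 5;   /* must match the values used when running the roll macro */
%let end   = 6;

data hstep;
   set finalFor nobs=n;                                  /* n = number of rows, N */
   array f{&start:&end} for_&start-for_&end;             /* for_5, for_6, ...      */
   array e{&start:&end} residual_&start-residual_&end;   /* residual_5, ...        */
   do k = &start to &end;
      /* row N-k+h of for_k holds F(N-k,h) under the assumption stated above */
      if _n_ = n - k + &h then do;
         origin     = n - k;   /* index of the last observation used in the fit */
         forecast_h = f{k};
         residual_h = e{k};
         output;
      end;
   end;
   keep date air k origin forecast_h residual_h;
run;

proc print data=hstep;
run;
```

With start=5, end=6, and h=3, hstep should contain one row per value of k. Widening the range of k gives a longer series of h-step-ahead forecasts and residuals to work with.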