Forecasting using SAS Forecast Server, SAS/ETS, and more

Calculate predictions and compare with actual values

Accepted Solution Solved
Reply
Highlighted
New Contributor
Posts: 4
Accepted Solution

Calculate predictions and compare with actual values

Dear all,

 

I have sales data that is grouped by company and year. I would like to "forecast" sales data within the sample so that I could compare the predicted sales figures with the actual sales data. For example, the data spans from 1984 to 2015 for many companies and I would like to calculate the predicted sales with a exponential smoothing procedure in 1988 from actual sales of the four previous years (1984-1987) and continue this throughout all available years so I would have a sales value for 1988 (from 1984-1987), 1989 (from 1985-1988) and so forth. So far I have been able to forecast data for the "next" year (in this case 2016) but I do not know how to set the forecast window so that it would move in 4-year increments.

 

Thank you,

 

Pieter


Accepted Solutions
Solution
a week ago
Super User
Posts: 528

Re: Calculate predictions and compare with actual values

You could use proc expand or (I think) proc timedata to make rolling 4-year windows, but I don't recall the syntax off hand.  Also, you need to have a license for the SAS/ETS (econometrics/time series) package.

 

Here's a data step that modifies your data, so that instead of "by gvkey" you can use "by gvkey window", where window=1 for the first 4 years, window=2 for the 2nd through 5th years, etc.   That is they are rolling windows.  Here's the code:

 

 

DATA firm; 
   INPUT fyear gvkey sale; 
   DATALINES; 
1993 1013 366.1180
1994 1013 448.7350
1995 1013 586.2220
1996 1013 828.0090
1997 1013 1164.4500
1998 1013 1379.6780
1999 1013 1926.9470
2000 1013 3287.9000
2001 1013 2402.8000
2002 1013 1047.7000
2003 1013 773.2000
2004 1013 784.3000
2005 1013 1169.2000
2006 1013 1281.9000
2007 1013 1322.2000
2008 1013 1456.4000
2009 1013 996.7000
2010 1013 1156.6000
1993 1034 338.2300
1994 1034 469.2630
1995 1034 520.8820
1996 1034 486.1840
1997 1034 500.2880
1998 1034 604.5840
1999 1034 732.4430
2000 1034 900.7940
2001 1034 974.9900
2002 1034 1230.7620
2003 1034 1297.2850
2004 1034 1339.4800
2005 1034 553.6170
2006 1034 653.8280
2007 1034 722.4250
run;

data vneed / view=vneed;
  /* Get gvkey of current rec and current+3 record*/
  merge firm (keep=gvkey rename=(gvkey=gvkey1))
        firm (firstobs=4 keep=gvkey rename=(gvkey=gvkey4));

  /* If starting a new gvkey, reset window index */
  if lag(gvkey1)^=gvkey1 then window=0;
  window+1;
  if gvkey1^=gvkey4 then window=.;

  /* Read and write 4 records for valid windows*/
  set firm;              if window^=. then output;
  set firm (firstobs=2); if window^=. then output;
  set firm (firstobs=3); if window^=. then output;
  set firm (firstobs=4); if window^=. then output;
run;

proc forecast data=vneed out=new2 lead=1 method=expo weight=.75 nstart=3; 
  by gvkey window;
id fyear;
var sale;
run;

 

Notes:

  1. The merge statement merges observation J with observation J+3, keeping only the gvkeys (renamed to gvkey1 and gvkey4).  If gvkey1=gvkey4 then you know a valid 4 record window can be made.  That's the sole purpose of this merge statement.
  2. Calculate WINDOW.  The first window of 4 records for a gvkey will have window=1, then 2, ...   Once there are insufficient records for a window (i.e. gvkey4^=gvkey1), it's set to a missing value.
  3. Then there are 4 SET statements, with each SET being offset from the prior SET by one observation and having its own output statement, assuming window number is valid.  This populates each window with 4 sequential records.
  4. The data statement make a data set VIEW instead of a data set FILE.  This means the data set won't be constructed until it is called for later.  And instead of being written to disk, it will be read directly by the calling proc (i.e. by the proc forecast).  Saves lots of disk input/output.
  5. Note that this does not do anything about gaps in your annual data.  If you don't want windows with gaps then change the merge statement to:
          merge firm (keep=gvkey fyear rename=(gvkey=gvkey1 fyear=fyear1))
            firm (firstobs=4 keep=gvkey fyear rename=(gvkey=gvkey4 fyear=fyear4));
    and change the "if window^=." statements to
         if window^=. and fyear4-fyear1=3 then output;
  6. The proc forecast is now "by gvkey window" instead of "by gvkey".  
  7. But the proc forecast issue this warning.
    1. WARNING: There are too few nonmissing values available for the variable sale to initialize the
               smoothed trend estimates as specified by the NSTART=8 option. Trend parameters initialized
               using all 4 values available.

      So I just inserted the "NSTART=MAX" option to keep the log neat.

View solution in original post


All Replies
Super User
Posts: 528

Re: Calculate predictions and compare with actual values

You want to do smoothing based on 4-year rolling windows within company.  Show us the smoothing code, and we can probably show you how to make the windows for submission to that smoothing code.

New Contributor
Posts: 4

Re: Calculate predictions and compare with actual values

Okay,

 

Thank you. I added a raw version of the code. At this junction, it will calculate the forecast for the "next" year from all the observations in the sample. I would like it to forecast a observation from 4 previous years and do this continually throughout the sample.

 

Best,

 

Pieter

Solution
a week ago
Super User
Posts: 528

Re: Calculate predictions and compare with actual values

You could use proc expand or (I think) proc timedata to make rolling 4-year windows, but I don't recall the syntax off hand.  Also, you need to have a license for the SAS/ETS (econometrics/time series) package.

 

Here's a data step that modifies your data, so that instead of "by gvkey" you can use "by gvkey window", where window=1 for the first 4 years, window=2 for the 2nd through 5th years, etc.   That is they are rolling windows.  Here's the code:

 

 

DATA firm; 
   INPUT fyear gvkey sale; 
   DATALINES; 
1993 1013 366.1180
1994 1013 448.7350
1995 1013 586.2220
1996 1013 828.0090
1997 1013 1164.4500
1998 1013 1379.6780
1999 1013 1926.9470
2000 1013 3287.9000
2001 1013 2402.8000
2002 1013 1047.7000
2003 1013 773.2000
2004 1013 784.3000
2005 1013 1169.2000
2006 1013 1281.9000
2007 1013 1322.2000
2008 1013 1456.4000
2009 1013 996.7000
2010 1013 1156.6000
1993 1034 338.2300
1994 1034 469.2630
1995 1034 520.8820
1996 1034 486.1840
1997 1034 500.2880
1998 1034 604.5840
1999 1034 732.4430
2000 1034 900.7940
2001 1034 974.9900
2002 1034 1230.7620
2003 1034 1297.2850
2004 1034 1339.4800
2005 1034 553.6170
2006 1034 653.8280
2007 1034 722.4250
run;

data vneed / view=vneed;
  /* Get gvkey of current rec and current+3 record*/
  merge firm (keep=gvkey rename=(gvkey=gvkey1))
        firm (firstobs=4 keep=gvkey rename=(gvkey=gvkey4));

  /* If starting a new gvkey, reset window index */
  if lag(gvkey1)^=gvkey1 then window=0;
  window+1;
  if gvkey1^=gvkey4 then window=.;

  /* Read and write 4 records for valid windows*/
  set firm;              if window^=. then output;
  set firm (firstobs=2); if window^=. then output;
  set firm (firstobs=3); if window^=. then output;
  set firm (firstobs=4); if window^=. then output;
run;

proc forecast data=vneed out=new2 lead=1 method=expo weight=.75 nstart=3; 
  by gvkey window;
id fyear;
var sale;
run;

 

Notes:

  1. The merge statement merges observation J with observation J+3, keeping only the gvkeys (renamed to gvkey1 and gvkey4).  If gvkey1=gvkey4 then you know a valid 4 record window can be made.  That's the sole purpose of this merge statement.
  2. Calculate WINDOW.  The first window of 4 records for a gvkey will have window=1, then 2, ...   Once there are insufficient records for a window (i.e. gvkey4^=gvkey1), it's set to a missing value.
  3. Then there are 4 SET statements, with each SET being offset from the prior SET by one observation and having its own output statement, assuming window number is valid.  This populates each window with 4 sequential records.
  4. The data statement make a data set VIEW instead of a data set FILE.  This means the data set won't be constructed until it is called for later.  And instead of being written to disk, it will be read directly by the calling proc (i.e. by the proc forecast).  Saves lots of disk input/output.
  5. Note that this does not do anything about gaps in your annual data.  If you don't want windows with gaps then change the merge statement to:
          merge firm (keep=gvkey fyear rename=(gvkey=gvkey1 fyear=fyear1))
            firm (firstobs=4 keep=gvkey fyear rename=(gvkey=gvkey4 fyear=fyear4));
    and change the "if window^=." statements to
         if window^=. and fyear4-fyear1=3 then output;
  6. The proc forecast is now "by gvkey window" instead of "by gvkey".  
  7. But the proc forecast issue this warning.
    1. WARNING: There are too few nonmissing values available for the variable sale to initialize the
               smoothed trend estimates as specified by the NSTART=8 option. Trend parameters initialized
               using all 4 values available.

      So I just inserted the "NSTART=MAX" option to keep the log neat.
New Contributor
Posts: 4

Re: Calculate predictions and compare with actual values

Thank you so much for your help.

 

Pieter

Post a Question
Discussion Stats
  • 4 replies
  • 149 views
  • 1 like
  • 2 in conversation