I am currently in the process of adding confidence intervals to forecasts that are created using PROC ESM, using the various model options. I am trying to understand how SAS creates the standard deviations and confidence intervals and decide if I am comfortable using them as they come out of SAS, or if I should calculate my own or tweak them.
How are the CIs created? I assume they come from an assumption of normality combined with the standard deviation. I tested this on the attached data and found it to be approximately so (columns I and J, compared to columns E and F).
How are the SDs created? These have me very confused, for 3 reasons.
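A minimal sketch of that check, in Python rather than SAS for illustration only. It assumes a 95% level (alpha = 0.05) and an interval that is symmetric around the prediction; the function names and the numbers in the usage lines are hypothetical and are not taken from the attached data.

```python
from statistics import NormalDist


def normal_interval(predict, std, alpha=0.05):
    """Two-sided normal interval: predict +/- z * std."""
    # z is the standard normal quantile for the chosen level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return predict - z * std, predict + z * std


def implied_z(lower, upper, std):
    """Back out the multiplier implied by a reported interval and SD."""
    half_width = (upper - lower) / 2
    return half_width / std


# Hypothetical values standing in for one row of an OUTFOR= data set.
lower, upper = normal_interval(predict=100.0, std=10.0)
print(round(lower, 2), round(upper, 2))
print(round(implied_z(lower, upper, 10.0), 3))
```

If the multiplier backed out of the reported lower and upper limits is close to 1.96 on every row, the limits are consistent with a normal-theory 95% interval built from the reported SD.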
1) They are increasing over time (as we go out more periods in the future). I had never thought about it, but I suppose this makes sense: the forecast for 3 periods from now is going to have less information behind it than the one for 2 periods from now. But on what basis? I've included 2 series of data, and the % change over time in the SD is markedly different between them. (Col L)
2) These are weekday seasonal forecasts, so I assume that there are differences in SD by day. My second calculation (Col M) compares the week-over-week change. The SD % changes differ vastly by day and between series. (FYI: These 2 series use different models available from PROC ESM: one results in the 2nd week having the same predicted values as the 1st, the other gives different values for the 2nd week. I am not sure if this is important for this question or not.)
3) The SD for period 1 is not at all similar to the SD for the actual data. Using SAS to calculate SD, I get 105,901 for Series1 (compared to 66,557 for period 1) and 10,363 for Series 2 (compared to 6,671). So the SD is smaller (for period 1, not necessarily all periods) for the forecast than the actual, which could be nice if it gives me a tighter CI based on lower volatility recently (which is the case in this data).
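For context on why a forecast SD would grow with the horizon: for simple exponential smoothing, the textbook h-step-ahead prediction-error variance is sigma^2 * (1 + (h - 1) * alpha^2), where sigma is the SD of the one-step-ahead errors and alpha is the level smoothing weight. The sketch below (Python, illustrative only) applies that formula. The seasonal models in PROC ESM use more involved expressions, so treat this as an assumption about the general shape, not as what SAS computes for these two series. The function name and the alpha values are hypothetical.

```python
import math


def ses_forecast_std(sigma, alpha, horizon):
    """Textbook h-step-ahead SD for simple exponential smoothing.

    sigma   : SD of the one-step-ahead forecast errors
    alpha   : level smoothing weight, between 0 and 1
    horizon : number of periods ahead (1 = next period)
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return sigma * math.sqrt(1 + (horizon - 1) * alpha ** 2)


# Compare how fast the SD grows over 14 periods for two hypothetical weights.
growth = {}
for alpha in (0.1, 0.8):
    stds = [ses_forecast_std(1.0, alpha, h) for h in range(1, 15)]
    growth[alpha] = stds[-1] / stds[0] - 1  # relative increase, period 1 to 14
    print(alpha, round(growth[alpha], 3))
```

Under this formula a series fitted with a larger smoothing weight shows a much steeper rise in SD over 14 periods than one with a small weight, which is one possible reason the % change differs between series. It would also be consistent with the period-1 SD being smaller than the SD of the raw data, since sigma measures the spread of one-step-ahead forecast errors rather than of the series itself.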
Why it's important:
I'm not necessarily looking for the most accurate CI, statistically speaking. I'd like to improve the CI algorithm over time perhaps, but initially it is not extremely important. My audience is not accustomed to thinking probabilistically, and by adding these CIs I want to take baby steps in that direction. My hope is that the variation in CIs, both between series and over time, will give them some insight into the non-homogeneity of the work we're doing in terms of volatility and predictability. I do not believe that CIs that grow dramatically from the beginning of the 2-week period to the end will be beneficial towards that end; rather, it will be confusing and is going to result in many more zeroes on the lower end. We cannot have negative work, so they already implicitly understand that zero is a floor; including it too often as a lower CI may lead them to lose confidence (ironically).
The sample output is from the outfor= data set from the PROC ESM statement, with a few irrelevant columns removed and my calculations added on the right.
NOTE: This question may only apply to PROC ESM, and not all SAS forecast procedures. We are transitioning to ESM but were using PROC FORECAST, which provides CIs in a very different manner. Within one series, the CI is the same % change down (Lower CI) and up (Upper CI) for every forecasted period. For series 1 of the example I attached, the CI is 63.5% in each direction from the predicted value for all 14 days. I think I may be amenable to a CI that changes over time (to reflect growing uncertainty, though not with as much variation as I currently get) and that varies the SD by weekday, though I do not know for sure whether ESM is doing that. The PROC FORECAST algorithm is definitely not equal to the Predict +/- 2*SD method using the actual SDs I mentioned above. It somehow picks a % to adjust by, and then forces the SD/CI to adhere to that regardless of the predicted value or # of periods in advance.
Hello -
I think the ESM documentation states that: "The techniques used in the ESM procedure are identical to those used for exponential smoothing models in the Time Series Forecasting". You will find a lot of information about how confidence intervals are computed for forecasts created using PROC ESM here:
I would also recommend having a look at this white paper, "Large-Scale Automatic Forecasting Using Inputs and Calendar Events", which gives some nice explanations of CIs that could be beneficial for your audience as well.
Thanks,
Udo