This juletip is a friendly reminder to look at all the options when transforming data for time series analysis, forecasting or predictive modeling. The examples provided come from the time series world.
When working with econometric models, certain assumptions usually have to be met. A model’s usability and precision are mostly validated by examining the distribution of the residuals and testing the residuals for independence.
If these assumptions aren’t met, one remedy to get “better” residuals is to use the logarithm of the target variable instead of the actual value of the target variable. This is also taught in many classes. This is all good, but there are many other options, as this juletip tries to point out.
If the logarithm has been tried and still doesn’t improve the residuals, one of three methods is normally applied:
- The ostrich method: Consider the results useful anyway. Never mention the residuals again. Maybe say that they are “good enough”.
- The run away method: Try another model which might be less suited for the analytics case, but is better at fulfilling assumptions.
- The Garfield method: Give up and go get some rest instead.
However, there are more transformations available than just the logarithm, and they are very easy to use in SAS, especially when creating time series data through PROC TIMESERIES.
These are (including the logarithm!):
- The logarithm of the target variable
- The Box-Cox transformation of the target variable
- The logistic transformation of the target variable
- The square root of the target variable
Keep in mind that all of these require the time series to be strictly positive.
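To make concrete what the four transformations do to the numbers, here is a minimal sketch in Python. It is not SAS syntax and not how PROC TIMESERIES is implemented; the function names are hypothetical, and the logistic version assumes the series has already been scaled into the open interval (0, 1).

```python
import math


def require_positive(series):
    """Raise ValueError unless every value is strictly positive."""
    if any(y <= 0 for y in series):
        raise ValueError("series must be strictly positive")


def log_transform(series):
    """Natural logarithm of each value."""
    require_positive(series)
    return [math.log(y) for y in series]


def sqrt_transform(series):
    """Square root of each value."""
    require_positive(series)
    return [math.sqrt(y) for y in series]


def box_cox_transform(series, lam):
    """Box-Cox: (y**lam - 1) / lam, with the logarithm as the lam == 0 case."""
    require_positive(series)
    if lam == 0:
        return [math.log(y) for y in series]
    return [(y ** lam - 1.0) / lam for y in series]


def logistic_transform(series):
    """Logit log(y / (1 - y)). Assumption: values already lie in (0, 1)."""
    if any(not 0 < y < 1 for y in series):
        raise ValueError("series must lie strictly between 0 and 1")
    return [math.log(y / (1.0 - y)) for y in series]
```

Note that the Box-Cox family contains the logarithm as a special case (parameter 0), and a parameter of 0.5 is the square root up to a shift and scale, so trying a few Box-Cox parameters covers a lot of ground.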
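In PROC TIMESERIES, the transformation is requested with the TRANSFORM= option of the VAR statement: TRANSFORM=LOG, TRANSFORM=BOXCOX(n), TRANSFORM=LOGISTIC or TRANSFORM=SQRT, where n is the Box-Cox parameter.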
Check them out before you become an ostrich, an escape artist, or a comfortable old cat!
If your goal is forecasting or prediction and you’ve finished your analysis with golden, shiny residuals, do not forget to transform your target variable and its forecast/prediction back to the original scale. For example, if you used the square root to transform your data, square the values afterwards!
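The back-transformation step can be sketched as pairs of forward and inverse functions. This is an illustrative Python sketch, not SAS code; the names `TRANSFORMS`, `make_box_cox` and `back_transform` are hypothetical, the Box-Cox parameter 0.5 is only an example, and the forecast values are made up.

```python
import math


def make_box_cox(lam):
    """Return (forward, inverse) functions for one Box-Cox parameter."""
    def forward(y):
        return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

    def inverse(z):
        # Assumes lam * z + 1 is positive, which holds for any z
        # produced by forward() on a strictly positive y.
        return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

    return forward, inverse


# Each entry maps a transformation name to its (forward, inverse) pair.
TRANSFORMS = {
    "log": (math.log, math.exp),
    "sqrt": (math.sqrt, lambda z: z ** 2),
    # Logit and its inverse; assumes the original values lie in (0, 1).
    "logistic": (lambda y: math.log(y / (1.0 - y)),
                 lambda z: 1.0 / (1.0 + math.exp(-z))),
    "boxcox_0.5": make_box_cox(0.5),
}


def back_transform(name, values):
    """Map values from the transformed scale back to the original scale."""
    _, inverse = TRANSFORMS[name]
    return [inverse(v) for v in values]


# Made-up forecasts produced on the square-root scale.
sqrt_scale_forecasts = [10.0, 11.0]
original_scale = back_transform("sqrt", sqrt_scale_forecasts)
print(original_scale)  # [100.0, 121.0]
```

Keeping the forward and inverse functions side by side makes it harder to forget the second half of the job once the residuals look good.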
- For other econometric models, there are other procedures that will help you transform your data to make it apt for analysis, for example PROC TRANSREG.
- In time series analysis, ask yourself if you are using the right accumulation for your time series; is it always the total or average you need to be looking at? For capacity forecasts, the maximum might be more apt.
- For time series analysis, PROC EXPAND provides even more transformations. Maybe Santa will tell you all about it next Christmas. Until then, or if he doesn’t, don’t forget our community and the support site!
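The accumulation question from the list above is easy to see with a small example. The following Python sketch (not SAS code) accumulates the same timestamped readings to monthly values in three ways; the function name `accumulate_by_month` and the readings are invented for illustration.

```python
from collections import defaultdict
from datetime import date


def accumulate_by_month(observations, how):
    """Accumulate (date, value) pairs to one value per month.

    how is one of 'total', 'average' or 'maximum'.
    """
    buckets = defaultdict(list)
    for day, value in observations:
        buckets[(day.year, day.month)].append(value)

    reducers = {
        "total": sum,
        "average": lambda vals: sum(vals) / len(vals),
        "maximum": max,
    }
    reduce_values = reducers[how]
    return {month: reduce_values(vals) for month, vals in sorted(buckets.items())}


# Hypothetical load readings, made up for illustration.
readings = [
    (date(2023, 1, 5), 40.0),
    (date(2023, 1, 20), 90.0),
    (date(2023, 2, 3), 50.0),
    (date(2023, 2, 17), 70.0),
]

totals = accumulate_by_month(readings, "total")
averages = accumulate_by_month(readings, "average")
peaks = accumulate_by_month(readings, "maximum")
```

For a capacity forecast, the monthly peaks (90 and 70 here) are what the system has to cope with; the totals and averages would hide them.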