pdpat43
Calcite | Level 5

I could use some guidance in selecting the most appropriate forecasting or prediction procedure given the following data, goals, and constraints:

Data:

Very large number of observations

Monthly data per person (Feb thru Dec in year 1; Jan thru Mar in year 2);

One continuous dependent variable

Goal:

Use Year 1 data to create a model

Apply early Year 2 data to predict December of Year 2

The monthly DV observations are not linear over time (simple downward slope - quadratic?)

My early attempts to predict have not been fruitful.

I have tried PROC LOESS, FORECAST, ARIMA, X12, and TRANSREG.

One of the above may be right, but the constraints of such large data mean long processing times and (commonly) insufficient memory.

I'd appreciate any guidance regarding the most suitable method so I can subset the data and try again.

Thanks!

6 REPLIES
pdpat43
Calcite | Level 5

No, I'll give PROC ESM a try as well.

At first glance, I don't see a SCORE option to predict on the Year 2 Jan-Mar values.

Am I missing something?

Thanks, Ksharp.

ets_kps
SAS Employee

Hi Ksharp,

The ESM procedure uses a lead= option rather than a SCORE statement. You can see the syntax here.

SAS/ETS(R) 13.1 User's Guide

Let us know if you need any help. -Ken

udo_sas
SAS Employee

Hello -

If you are considering time series techniques such as exponential smoothing, then your idea of "Use Year 1 data to create a model; apply early Year 2 data to predict December of Year 2" will not work.

Time series models are closely tied to the data which is used to estimate parameters. This is very different to techniques like OLS regression.

For example: you can "train" a predictive model such as a logistic regression on training data, create a score file, and then apply this score file to new data.

This concept does not apply to statistical forecasting models. Here you should use all available history to estimate the parameters of the model, and usually the most recent data is the most relevant. Also, once you have estimated the parameters, these models remain tied to the history that was used for estimation. Exponential smoothing models, for example, can be thought of as a moving average with infinite memory but exponentially falling weights.
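To make the "exponentially falling weights" picture concrete, here is a minimal Python sketch of simple exponential smoothing. It is not SAS code and not a description of how PROC ESM is implemented; the function names, the smoothing weight of 0.5, and the sample values are all assumptions chosen for illustration.

```python
def ses_level(history, alpha):
    """Return the final smoothed level of simple exponential smoothing."""
    level = history[0]  # initialise with the first observation (one common choice)
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ses_weights(n, alpha):
    """Weights the final level puts on each of n observations, oldest first."""
    weights = [alpha * (1 - alpha) ** (n - 1 - i) for i in range(n)]
    weights[0] = (1 - alpha) ** (n - 1)  # the starting value absorbs the remaining weight
    return weights


def ses_forecast(history, alpha, lead):
    """Simple exponential smoothing forecasts are flat: the last level, repeated."""
    return [ses_level(history, alpha)] * lead


history = [94.0, 93.0, 92.0, 91.0]  # hypothetical monthly values
weights = ses_weights(len(history), alpha=0.5)
level = ses_level(history, alpha=0.5)
forecast = ses_forecast(history, alpha=0.5, lead=3)
```

The final level is just a weighted sum of this particular history, with the newest observation weighted most heavily. That is why the fitted model cannot be detached from its history and re-applied to a different series the way a regression score file can.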

In my opinion, whether to use a predictive model or a statistical forecasting model depends on the business question you have in mind. Note that very scalable algorithms are available for both areas.

If you can share some example data and specify what you would like the results to look like, we might be able to come up with a code snippet for you.

Thanks,

Udo

pdpat43
Calcite | Level 5

Udo,

Thank you for the thorough response.

Here is a summarized version of what I have available:

Month  Year  Var1
1      2013
2      2013  94%
3      2013  93%
4      2013  92%
5      2013  91%
6      2013  91%
7      2013  80%
8      2013  90%
9      2013  89%
10     2013  88%
11     2013  87%
12     2013  87%
1      2014
2      2014  95%
3      2014  94%
4      2014  93%
5      2014  93%
6      2014
7      2014
8      2014
9      2014
10     2014
11     2014
12     2014

To make matters more complicated, Var1 is a rolling YTD average.

Does the "Accumulate=average" subcommand account for the heavier weight toward year's end?
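For reference, if Var1 is a plain (equal-weighted) year-to-date average, the underlying monthly values can be recovered by turning each average back into a running total and differencing. Below is a minimal Python sketch of that arithmetic, not SAS code. The function name and the sample numbers are hypothetical, and it assumes the first value in the list is the average over exactly one period.

```python
def ytd_to_monthly(ytd_averages):
    """Recover per-period values from a running (year-to-date) average.

    Assumes ytd_averages[k - 1] is the plain mean of the first k periods.
    """
    monthly = []
    previous_total = 0.0
    for k, avg in enumerate(ytd_averages, start=1):
        total = k * avg  # running sum implied by the k-period average
        monthly.append(total - previous_total)
        previous_total = total
    return monthly


# Hypothetical values: monthly figures 94, 92, 90 give YTD averages 94, 93, 92.
ytd = [94.0, 93.0, 92.0]
monthly = ytd_to_monthly(ytd)
```

In a YTD average, the k-th month contributes only 1/k of the value, so the series moves less and less as the year goes on. The de-accumulated monthly series may therefore be an easier target to model.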

Thank you!

udo_sas
SAS Employee

Hello -

Many thanks for sharing an example - I don't think that statistical forecasting techniques such as exponential smoothing will be applicable in your situation, due to the lack of history. After plotting your data, I was thinking that you might be better off applying a curve-fitting technique such as LOESS to your data and trying to come up with a "profile" which can be applied to future points. However, the lack of historic data will again be an issue, unless you assume that 2013 is strongly representative of 2014.
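One rough way to picture the "profile" idea is sketched below in Python (not SAS). It is deliberately simpler than LOESS: it uses the raw 2013 values as the profile, with no smoothing, and shifts them by the average gap observed in the months both years share. The function name is hypothetical, and the whole approach rests on the assumption that the 2013 shape carries over to 2014.

```python
# Var1 values (in percent) from the table earlier in this thread, keyed by month number.
var1_2013 = {2: 94, 3: 93, 4: 92, 5: 91, 6: 91, 7: 80,
             8: 90, 9: 89, 10: 88, 11: 87, 12: 87}
var1_2014 = {2: 95, 3: 94, 4: 93, 5: 93}


def profile_forecast(profile_year, current_year, target_month):
    """Shift last year's profile by the average gap seen in the overlapping months."""
    overlap = [m for m in current_year if m in profile_year]
    if not overlap:
        raise ValueError("no overlapping months to estimate the level shift")
    shift = sum(current_year[m] - profile_year[m] for m in overlap) / len(overlap)
    return profile_year[target_month] + shift


december_2014 = profile_forecast(var1_2013, var1_2014, target_month=12)
```

With the posted numbers, 2014 runs 1.25 points above 2013 on average over February to May, so this sketch puts December 2014 at 88.25%. With only one prior year there is no way to check that assumption, which is the lack-of-history issue in practice.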

This e-newsletter might give you some ideas: http://support.sas.com/community/newsletters/training/forecasting.html

Hope this makes sense.

Thanks,

Udo

