pdpat43
Calcite | Level 5

I could use some guidance in selecting the most appropriate forecasting or prediction procedure given the following data, goals, and constraints:

Data:

Very large number of observations

Monthly data per person (Feb thru Dec in year 1; Jan thru Mar in year 2);

One continuous dependent variable

Goal:

Use Year 1 data to create a model

Apply early Year 2 data to predict December of Year 2

The monthly DV observations are not linear over time (simple downward slope - quadratic?)

My early attempts to predict have not been fruitful.

I have tried PROC LOESS, FORECAST, ARIMA, X12, and TRANSREG.

One of the above may be right, but with data this large the constraints are long processing times and (commonly) insufficient memory.

I'd appreciate any guidance regarding the most suitable method so I can subset the data and try again.

Thanks!

6 REPLIES
pdpat43
Calcite | Level 5

No, I'll give PROC ESM a try as well.

At first glance, I don't see a SCORE option to predict on the Year 2 Jan-Mar values.

Am I missing something?

Thanks, Ksharp.

ets_kps
SAS Employee

Hi Ksharp,

The ESM procedure uses a LEAD= option rather than a SCORE statement. You can see the syntax here.

SAS/ETS(R) 13.1 User's Guide

Let us know if you need any help. -Ken

udo_sas
SAS Employee

Hello -

If you are considering time series techniques such as exponential smoothing, then your idea of "Use Year 1 data to create a model; apply early Year 2 data to predict December of Year 2" will not work.

Time series models are closely tied to the data which is used to estimate parameters. This is very different to techniques like OLS regression.

For example: you can "train" a predictive model such as a logistic regression on training data, create a score file, and then apply this score file to new data.

This concept does not apply to statistical forecasting models. Here you should use all available history to estimate the parameters of the model, and usually the most recent data is the most relevant. Also, once you have estimated the parameters, these models are usually tied to the history which was used for estimation. For exponential smoothing models, for example, you can think of a moving average with infinite memory but with exponentially falling weights.
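To make the "exponentially falling weights" picture concrete, here is a minimal Python sketch of simple exponential smoothing. It is only an illustration, not what PROC ESM runs internally: the smoothing weight `alpha = 0.5` is an arbitrary assumption, the level is initialized to the first observation (one common convention), and the history is the 2013 Var1 column posted later in this thread.

```python
def ses_level(history, alpha):
    """Simple exponential smoothing, recursive form: return the final level."""
    level = history[0]  # assumption: initialize the level with the first value
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ses_weights(n, alpha):
    """Weight that the final level places on each observation, oldest first."""
    # Newest observation gets alpha, the one before alpha*(1-alpha), and so on.
    newest_first = [alpha * (1 - alpha) ** k for k in range(n - 1)]
    # The initial value carries whatever weight is left over.
    newest_first.append((1 - alpha) ** (n - 1))
    return list(reversed(newest_first))


# 2013 values (Feb..Dec) as posted in this thread; alpha is an assumed value.
history = [94, 93, 92, 91, 91, 80, 90, 89, 88, 87, 87]
alpha = 0.5

level = ses_level(history, alpha)
weights = ses_weights(len(history), alpha)

# The recursive level equals a weighted average of the whole history.
weighted_average = sum(w * y for w, y in zip(weights, history))
print(level, weighted_average, weights[-3:])
```

The final level is the one-step-ahead forecast, and it is a weighted sum of the specific history it was computed from. That is why the fitted model cannot be carried over to a different series the way a regression score file can.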

In my opinion the question of whether to use a predictive model or a statistical forecasting model depends on the business question you have in mind. Note that very scalable algorithms are available in both areas.

If you can share some example data and specify how you would like the results to look, we might be able to come up with a code snippet for you.

Thanks,

Udo

pdpat43
Calcite | Level 5

Udo,

Thank you for the thorough response.

Here is a summarized version of what I have available:

Month  Year  Var1
1      2013
2      2013  94%
3      2013  93%
4      2013  92%
5      2013  91%
6      2013  91%
7      2013  80%
8      2013  90%
9      2013  89%
10     2013  88%
11     2013  87%
12     2013  87%
1      2014
2      2014  95%
3      2014  94%
4      2014  93%
5      2014  93%
6      2014
7      2014
8      2014
9      2014
10     2014
11     2014
12     2014

To make matters more complicated, Var1 is a rolling YTD average.

Does the "Accumulate=average" subcommand account for the heavier weight toward year's end?

Thank you!
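On the rolling YTD average: as far as I understand it, ACCUMULATE= only controls how multiple observations that fall in the same time interval are combined, so it would not undo a year-to-date average. If Var1 is a plain running mean of equally weighted monthly values starting in February (an assumption; a YTD figure weighted by monthly volume would need those volumes), the underlying monthly values can be backed out as m_k = k*Y_k - (k-1)*Y_(k-1). A minimal Python sketch, with a hypothetical helper name:

```python
def monthly_from_ytd(ytd_avg):
    """Back out per-month values from a running YTD mean.

    Assumes every month carries equal weight in the YTD average.
    """
    monthly = []
    prev_total = 0.0
    for k, avg in enumerate(ytd_avg, start=1):
        total = k * avg  # YTD sum implied by the YTD mean after k months
        monthly.append(total - prev_total)
        prev_total = total
    return monthly


# 2013 YTD averages (Feb..Dec) from the table above, in percent.
ytd_2013 = [94, 93, 92, 91, 91, 80, 90, 89, 88, 87, 87]
monthly_2013 = monthly_from_ytd(ytd_2013)
print(monthly_2013)
```

Under this assumption the first four implied monthly values are 94, 92, 90, and 88, while the 80% entry for July implies monthly values of 25 and then 150. That suggests either that entry is off or the equal-weight assumption does not hold. Rounding in the posted percentages also gets multiplied by k, so late-year implied values are noisy.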

udo_sas
SAS Employee

Hello -

Many thanks for sharing an example. I don't think that statistical forecasting techniques such as exponential smoothing will be applicable in your situation, due to the lack of history. After plotting your data I was thinking that you might be better off fitting a curve to your data with a technique such as LOESS and trying to come up with a "profile" which can be applied to future points. However, the lack of historic data will again be an issue, unless you assume that 2013 is a strong representative of 2014.
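One very simple reading of the "profile" idea, sketched in Python: treat the 2013 curve as the shape of the year, shift it by the average gap between 2014 and 2013 over the months observed in both, and read off December. This is a plain level shift rather than LOESS (a smoother would first iron out bumps such as the July 2013 value), the function name is hypothetical, and it rests entirely on the assumption that 2013 is representative of 2014.

```python
def project_with_profile(profile_year, partial_year, target_month):
    """Shift a full-year profile to match a partial year, then read one month.

    Both inputs map month number -> value. The shift is the mean difference
    over the months present in both years.
    """
    overlap = sorted(set(profile_year) & set(partial_year))
    if not overlap:
        raise ValueError("the two years share no observed months")
    shift = sum(partial_year[m] - profile_year[m] for m in overlap) / len(overlap)
    return profile_year[target_month] + shift


# Values from the table posted in this thread, in percent.
var1_2013 = {2: 94, 3: 93, 4: 92, 5: 91, 6: 91, 7: 80,
             8: 90, 9: 89, 10: 88, 11: 87, 12: 87}
var1_2014 = {2: 95, 3: 94, 4: 93, 5: 93}

dec_2014 = project_with_profile(var1_2013, var1_2014, target_month=12)
print(dec_2014)
```

With the posted numbers the average gap over February to May is 1.25 points, so the projected December 2014 value is 88.25%. With a single prior year there is no way to check how well the shape carries over, so treat this as a rough baseline only.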

This e-newsletter might give you some ideas: http://support.sas.com/community/newsletters/training/forecasting.html

Hope this makes sense.

Thanks,

Udo
