03-28-2012 02:13 PM
Hi, I realize the subject line may confuse many, but here's what I'm wondering if someone can help. I have over 1500 time series (stores by products) that have different end dates. For example one store may have thier Price data from Jan'1996 to Jun'2010 and another from March'2000 to Dec'2011, and so on for each different Sore and Product. I also do have other variables other than Price that also ends differently by month and year for each Store x Product (since it's from many different sources). Is there a way in Proc Timeseries or any other procedure that I can compute (in a sense forecast) all data to have a similar end-date for each StorexProduct (for example Dec'2011). I eventually want to use ARIMAX to forecast Prices for the next 12 months, but I have to get all series to the same ending line, or starting line depends how you look at it.... Thanks in advance.
03-28-2012 03:34 PM
Have a look at the START and END options of the ID statement of PROC TIMESERIES.
Combined with the SETMISSING statement you can force all series to start and end at the same data and impute the missing values accordingly if needed.
Example (will work with BY variables, too):
proc timeseries data=sashelp.workers out=want;
id date interval=month start="01JAN76"d end="31DEC1982"d setmissing=mean;
var electric masonry;
03-28-2012 04:38 PM
Thanks Udo, my only problem is that none of the setmissing options make sense for my data. What I mean by that is that I have a cyclical and season and trend with different hills and valleys all over the place. If I do a mean, it will input a straight line from the last point of the known data. Any idea what I can use to look at previous data wiht the cycles and trends, etc, to get a non linear forecast. Or do I have to use ARIMA if I want that.? Thanks
03-28-2012 05:11 PM
PROC ESM for example features the END option of the ID statement as well.
It might be worthwhile to try and run it on your data without replacing the newly introduced missing values at the end of the series.
proc esm data=sashelp.workers out=_null_ outfor=want lead=12 plot=forecasts;
id date interval=month start="01JAN76"d end="31DEC1983"d;
forecast electric masonry / model=winters;
*Note that I'm introducing more than a year of missing values at the end of the series - the multistep forecast handles this pretty well with my test data;
Alternatively you could consider a 2 step process: first run TIMESERIES to make sure that all series end at the same date and then impute the missing values with some more elaborate technique which you have in mind.
03-30-2012 09:59 AM
Thanks Udo, I'll give it a try.. Here are my 2 options from what I understand.
1) Use proc esm to forecast all the ending missing data to a set date eg. 31Dec2011
2) Use proc timeseries to end at a certain date eg.31Dec2011 and then use something like ARIMA to forecast the current missing data .. This way is still a littl econfusing as ARIMA will start at the Dec2011 date and not sure how it will fit in each Store x Product differently.. I will try it. '
11-02-2012 11:02 AM
We use a data step to create a one row table where date = enddate for each time series, then append that to our time series data.
Then, we use proc expand to zero-fill all of the missing days. Now, I know that you do not want to zero-fill, but proc expand does have different interpolation options such as moving average, spline, etc. It could help you vs. proc arima because you can use a by variable in proc expand to do this to all of your time series at once (we fill in all the missing dates across all time series and zero-fill them in one procedure).