JMC
Calcite | Level 5

Hello,

I am forecasting time series data with hpfdiagnose and I'm running into a problem with the size of my holdout sample. The code runs quickly (<20 sec) when my holdout sample is 5-9% of data, but the forecasting starts to take incredibly long when I increase the size of my holdout sample to higher values (e.g., 10-20%).

What do you think explains this non-linear relationship between holdout sample size and running time? Is there something that I can do about it?

Best,

JMC

udo_sas
SAS Employee

Hello -

In a way I find your findings counterintuitive, as I would expect faster run times when increasing the holdout sample values.

Would you mind sharing your code and some test data so that I can replicate your findings?

Thanks,

Udo

JMC
Calcite | Level 5

Hello Udo,

Thank you for your response. Unfortunately, my data is proprietary so I cannot share it. However, I have found the following.

My time series is daily. When seasonality=365, the size of the holdout sample has a large impact on how long the process takes. When seasonality=7, this no longer occurs. I originally set seasonality=365 because my data has an interesting pattern where every year there is an almost clockwork-like increase of values from the previous years. In other words, the pattern of data within a year is almost perfectly replicated the following year, but the absolute values are all higher than in the previous year. I found that by setting seasonality=365, I was able to forecast this yearly step-wise function. With seasonality=7, it doesn't work.
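Since the real data cannot be shared, a synthetic stand-in with the same shape may help others reproduce the behaviour. The sketch below is in Python rather than SAS, purely as an illustration. It assumes one possible reading of the description above: a weekly cycle superimposed on a level that steps up once per calendar year. The function name, the weekly pattern values, and the step size are all hypothetical.

```python
from datetime import date, timedelta

def synthetic_series(start, n_days, weekly_pattern, yearly_step, base_level=100.0):
    """Return (date, value) pairs: weekly cycle plus a level that steps up each calendar year."""
    rows = []
    for i in range(n_days):
        d = start + timedelta(days=i)
        # Level shift: one step per calendar year elapsed since the start year (assumption).
        level = base_level + yearly_step * (d.year - start.year)
        # weekday() is 0 for Monday through 6 for Sunday.
        rows.append((d, level + weekly_pattern[d.weekday()]))
    return rows

# Three years of made-up daily data; all numbers are illustrative only.
series = synthetic_series(
    date(2012, 1, 1),
    3 * 365,
    weekly_pattern=[5.0, 3.0, 0.0, -2.0, -4.0, 8.0, 10.0],
    yearly_step=20.0,
)
```

A series like this could be exported and fed to the same diagnose step with different holdout sizes to check whether the slowdown reproduces outside the proprietary data.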

Do you have any suggestions for how I could model seasonality=7 and still get the yearly jumps? I've tried adding a year regressor, but this doesn't seem to be doing the trick.

Best,

Juan Manuel

udo_sas
SAS Employee

Hello Juan Manuel -

Since your data is on daily frequency, seasonality=7 seems to be the right choice.

Of course this does not address your question about modeling the second seasonality you have discovered in your data.

I certainly understand your concern about proprietary data, but without seeing the data my advice has to stick to conceptual ideas only.

When you say that "there is an almost clockwork-like increase of values from the previous years", would you describe this pattern as a monthly cycle or a weekly cycle - or do you see level shifts across several years?

What I'm getting at is the fact that you should be able to define discrete events, such as calendar events Jan-Dec or Week1-Week52, to model this effect. Alternatively, you may want to introduce an adjustment variable which mimics the level shifts. Yet another approach might be to model your data in a two-step manner: model on daily frequency, model on monthly frequency, and then reconcile both forecasts using the HPFTEMPRECONCILE procedure.
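To make the first two ideas concrete, here is a minimal sketch of what such input variables could look like for a daily series: a step variable that counts calendar years since the start (mimicking the level shifts) and twelve month-of-year indicators (the Jan-Dec calendar events). It is written in Python only to show the shape of the data, not SAS syntax; the function and column names are hypothetical, and it assumes the level shift happens at each calendar-year boundary.

```python
from datetime import date, timedelta

def calendar_regressors(start, n_days):
    """Build one row per day with a yearly step variable and month-of-year dummies."""
    rows = []
    for i in range(n_days):
        d = start + timedelta(days=i)
        # Step variable: 0 in the first year, 1 in the second, and so on (assumption).
        row = {"date": d, "year_step": d.year - start.year}
        # Twelve indicator columns, exactly one of which is 1 on any given day.
        for m in range(1, 13):
            row[f"month_{m:02d}"] = 1 if d.month == m else 0
        rows.append(row)
    return rows

# Example: 400 days starting 1 January 2012 (dates are illustrative only).
regs = calendar_regressors(date(2012, 1, 1), 400)
```

Any variable of this kind has to be extended over the forecast horizon as well, since its future values are needed to produce the forecasts.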

Hope this is useful.

Thanks,

Udo
