Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Forecasting
- /
- Re: Handling Time Variable in Time Series Modeling

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 12-20-2017 11:28 PM
(1555 views)

Hello, I have a question about handling the TIME variable in my time series analysis. Hope someone can help 🙂

Below is my data:

```
data sales;
input Week weekday Target;
datalines;
1 4 539.577
1 5 224.675
1 6 129.412
2 2 317.12
2 3 210.517
2 4 207.364
2 5 263.043
2 6 248.958
3 2 344.291
3 3 248.428
3 4 281.42
3 5 243.568
3 6 308.178
4 2 363.402
4 3 336.872
4 4 246.992
4 5 308.88
4 6 233.126
5 2 404.38
1 3 298.56
1 4 229.249
1 5 236.304
1 6 297.174
2 2 409.401
2 3 231.035
2 4 238.826
2 5 235.598
2 6 242.112
3 2 490.79
3 3 289.657
3 4 298.459
3 5 323.603
4 3 616.453
4 4 346.035
4 5 307.645
4 6 253.847
5 2 530.944
5 3 333.359
5 4 306.356
1 6 416.83
2 2 415.187
2 3 268.002
2 4 234.503
2 5 234.724
2 6 230.064
3 2 357.394
3 3 259.246
3 4 244.235
3 5 402.607
3 6 255.061
4 2 342.606
4 3 268.64
4 4 188.601
4 5 202.022
4 6 213.509
5 2 316.849
5 3 286.412
5 4 303.447
5 5 304.95
5 6 331.9
;
run;
```

So basically, I have three columns. The first two are the WEEK and WEEKDAY columns. The WEEK column contains the week number and the second column contains the day of the week (e.g. 2=Monday, 3=Tuesday, 4=Wednesday...etc).

Just as an example, the first observation has WEEK=1 and WEEKDAY=4. This is the time point for the Wednesday in week 1.

I'm modeling the TARGET variable using Proc Arima and Proc ESM. I believe I need one TIME variable for the x-axis. There are in total 60 observations. I did a quick check and saw that the response is substantially higher on Monday and lower on the rest of the week.

I'm wondering how I can establish the TIME variable that reflects the weekday information. For example, if I treat the TIME simply as something from 1 to 60, I will lose the seasonal component of the data (because there are some missing days due to holiday).

I believe I have two options:

- Convert the WEEK and WEEKDAY columns into an actual date column. That should account for the weekday information.
- Create a new TIME variable that goes from 1 to 60 and use it as my TIME variable on the x-axis. Create dummy input variables for Monday to Friday.

I'm fairly new to time series. Does anyone have any comment about how this situation should be handled?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The best way would be to create a date variable to use for the time axis. You will need rows for the Saturday and Sunday dates with missing values for the target variable.

Then you can include weekly differencing in estimate statements for fitting the model. The following will evaluate daily differences and weekly.

identify var=target; estimate p=(1 7);

In the forecasting stage, you can use the ID and INTERVAL options to specify it is daily data.

The ID and INTERVAL can be specified in the FORECAST statement, for example,

forecast id=date interval=day;

You shouldn't need a dummy variable for weekday after doing this, at least in PROC ARIMA. It should see if weekly differencing helps with the model based on the INTERVAL=DAY option.

You can also find more information here:

PROC ESM can be approached similarly, except the ID variable is specified in its own statement, for example:

proc esm data=<input-data-set> out=<output-data-set>; id <time-ID-variable> interval=<frequency>; forecast <time-series-variables>; run;

Here is another link to the PROC ESM documentation

2 REPLIES 2

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The best way would be to create a date variable to use for the time axis. You will need rows for the Saturday and Sunday dates with missing values for the target variable.

Then you can include weekly differencing in estimate statements for fitting the model. The following will evaluate daily differences and weekly.

identify var=target; estimate p=(1 7);

In the forecasting stage, you can use the ID and INTERVAL options to specify it is daily data.

The ID and INTERVAL can be specified in the FORECAST statement, for example,

forecast id=date interval=day;

You shouldn't need a dummy variable for weekday after doing this, at least in PROC ARIMA. It should see if weekly differencing helps with the model based on the INTERVAL=DAY option.

You can also find more information here:

PROC ESM can be approached similarly, except the ID variable is specified in its own statement, for example:

proc esm data=<input-data-set> out=<output-data-set>; id <time-ID-variable> interval=<frequency>; forecast <time-series-variables>; run;

Here is another link to the PROC ESM documentation

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

So there are several ways to approach this, but your data set is missing necessary information for creating a date variable. You can easily convert these, but you have data that lists five weeks, and then repeats the same weeks, which leads me to the following questions about your data: Are there by variables? Do you know the year? Are these repeated observations for the same set of weeks, or are these different years or just by groups?

You could also use seasonality=5 on the proc statement, which would tell it that there is a seasonality of 5, as you would guess, but I'm not sure how you're using PROC ESM or ARIMA without a date variable?

I have drafted a little bit of code that could help you, but I am making the assumption that the year is 2017 for all of your date variables and your data is not sorted by week, thus you have repeated observations. You will notice there will be a large gap in data if this spans multiple years since there are 52 weeks in a year and you only have five.

I added this code above your "datalines" statement

the first line will get the date for the Sunday of the given week, you then add the number of days to get the actual day. There might be a more sophisticated way to do this with multipliers and shifts, but this was the easiest. I added a format to show you the dates will give you the weekday as given. More information on this is given in the ETS doc chapter "Date Intervals, Formats, and Functions"

datestart=intnx('week','01JAN2017'd,week);

actualdate=intnx('day',datestart,weekday-1);

format datestart actualdate weekdate.;

**Don't miss out on SAS Innovate - Register now for the FREE Livestream!**

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.