BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kisumsam
Quartz | Level 8

Hello, I have a question about handling the TIME variable in my time series analysis. Hope someone can help 🙂

 

Below is my data:

 

data sales;
input Week weekday Target;
datalines;
1 4 539.577
1 5 224.675
1 6 129.412
2 2 317.12
2 3 210.517
2 4 207.364
2 5 263.043
2 6 248.958
3 2 344.291
3 3 248.428
3 4 281.42
3 5 243.568
3 6 308.178
4 2 363.402
4 3 336.872
4 4 246.992
4 5 308.88
4 6 233.126
5 2 404.38
1 3 298.56
1 4 229.249
1 5 236.304
1 6 297.174
2 2 409.401
2 3 231.035
2 4 238.826
2 5 235.598
2 6 242.112
3 2 490.79
3 3 289.657
3 4 298.459
3 5 323.603
4 3 616.453
4 4 346.035
4 5 307.645
4 6 253.847
5 2 530.944
5 3 333.359
5 4 306.356
1 6 416.83
2 2 415.187
2 3 268.002
2 4 234.503
2 5 234.724
2 6 230.064
3 2 357.394
3 3 259.246
3 4 244.235
3 5 402.607
3 6 255.061
4 2 342.606
4 3 268.64
4 4 188.601
4 5 202.022
4 6 213.509
5 2 316.849
5 3 286.412
5 4 303.447
5 5 304.95
5 6 331.9
;
run;

So basically, I have three columns. The first two are the WEEK and WEEKDAY columns. The WEEK column contains the week number and the second column contains the day of the week (e.g. 2=Monday, 3=Tuesday, 4=Wednesday...etc).

 

Just as an example, the first observation has WEEK=1 and WEEKDAY=4. This is the time point for the Wednesday in week 1.

 

I'm modeling the TARGET variable using Proc Arima and Proc ESM. I believe I need one TIME variable for the x-axis. There are in total 60 observations. I did a quick check and saw that the response is substantially higher on Monday and lower on the rest of the week.

 

I'm wondering how I can establish the TIME variable that reflects the weekday information. For example, if I treat the TIME simply as something from 1 to 60, I will lose the seasonal component of the data (because there are some missing days due to holiday). 

 

I believe I have two options:

  1. Convert the WEEK and WEEKDAY columns into an actual date column. That should account for the weekday information.
  2. Create a new TIME variable that goes from 1 to 60 and use it as my TIME variable on the x-axis. Create dummy input variables for Monday to Friday.

I'm fairly new to time series. Does anyone have any comment about how this situation should be handled?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
solarflare
SAS Employee

The best way would be to create a date variable to use for the time axis. You will need rows for the Saturday and Sunday dates with missing values for the target variable. 

 

Then you can include weekly differencing in estimate statements for fitting the model. The following will evaluate daily differences and weekly.

 

identify var=target;
   estimate p=(1 7);

In the forecasting stage, you can use the ID and INTERVAL options to specify it is daily data.

 

The ID and INTERVAL can be specified in the FORECAST statement, for example,

 

forecast id=date interval=day;

 

You shouldn't need a dummy variable for weekday after doing this, at least in PROC ARIMA. It should see if weekly differencing helps with the model based on the INTERVAL=DAY option.

 

You can also find more information here:

PROC ARIMA documentation

 

 

PROC ESM can be approached similarly, except the ID variable is specified in its own statement, for example:

 

proc esm data=<input-data-set> out=<output-data-set>;
   id <time-ID-variable> interval=<frequency>;
   forecast <time-series-variables>;
run;

 

Here is another link to the PROC ESM documentation 

View solution in original post

2 REPLIES 2
solarflare
SAS Employee

The best way would be to create a date variable to use for the time axis. You will need rows for the Saturday and Sunday dates with missing values for the target variable. 

 

Then you can include weekly differencing in estimate statements for fitting the model. The following will evaluate daily differences and weekly.

 

identify var=target;
   estimate p=(1 7);

In the forecasting stage, you can use the ID and INTERVAL options to specify it is daily data.

 

The ID and INTERVAL can be specified in the FORECAST statement, for example,

 

forecast id=date interval=day;

 

You shouldn't need a dummy variable for weekday after doing this, at least in PROC ARIMA. It should see if weekly differencing helps with the model based on the INTERVAL=DAY option.

 

You can also find more information here:

PROC ARIMA documentation

 

 

PROC ESM can be approached similarly, except the ID variable is specified in its own statement, for example:

 

proc esm data=<input-data-set> out=<output-data-set>;
   id <time-ID-variable> interval=<frequency>;
   forecast <time-series-variables>;
run;

 

Here is another link to the PROC ESM documentation 

Jennifer_beeman
SAS Employee

So there are several ways to approach this, but your data set is missing necessary information for creating a date variable.  You can easily convert these, but you have data that lists five weeks, and then repeats the same weeks, which leads me to the following questions about your data: Are there by variables? Do you know the year?  Are these repeated observations for the same set of weeks, or are these different years or just by groups? 

 

You could also use seasonality=5 on the proc statement, which would tell it that there is a seasonality of 5, as you would guess, but I'm not sure how you're using PROC ESM or ARIMA without a date variable?

 

I have drafted a little bit of code that could help you, but I am making the assumption that the year is 2017 for all of your date variables and your data is not sorted by week, thus you have repeated observations. You will notice there will be a large gap in data if this spans multiple years since there are 52 weeks in a year and you only have five.

 

I added this code above your "datalines" statement

the first line will get the date for the Sunday of the given week, you then add the number of days to get the actual day. There might be a more sophisticated way to do this with multipliers and shifts, but this was the easiest. I added a format to show you the dates will give you the weekday as given. More information on this is given in the ETS doc chapter "Date Intervals, Formats, and Functions"

 

datestart=intnx('week','01JAN2017'd,week);
actualdate=intnx('day',datestart,weekday-1);
format datestart  actualdate weekdate.;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1556 views
  • 2 likes
  • 3 in conversation