As the developer of the SAS HOLIDAY function, the DATEKEYS procedure, the HPFEVENTS procedure, and the EVENT object of the ATSM package, "When is the Super Bowl?" is a question that I'm often asked. It's logical to assume that if SAS can calculate the date of Easter and Christmas, then surely the date of the most important event of the year could be calculated, correct?
The problem is that the dates for the Super Bowl can't be calculated because they are determined by a committee. The Super Bowl dates and locations are selected a few years in advance. However, the Super Bowl date has been changed in the past; in 2002 the Super Bowl date was changed "from the original date of January 27 to February 3" . This year, the NFL is already considering moving the date of the Super Bowl due to the COVID-19 pandemic. At SAS, if our software answers a question, we make our very best effort for the answer to be correct! It's easy to know the dates of Super Bowls that have already occurred, and analyzing historical data is how forecasting methods estimate the future effect. To forecast, it's important to know the future dates. In order to accurately estimate the past effects and forecast the future effects, the dates have to be maintained on a regular basis.
Sometimes even dates that can be calculated are changed. Mardi Gras is a large celebration in New Orleans that is based on the date of Easter. Easter can be calculated. However, in 2002 the dates for Mardi Gras were changed to accomodate Super Bowl XXXVI. This occurred again in 2013 for Super Bowl XLVII. The next conflict between Mardi Gras and Super Bowl is in 2024 for Super Bowl LVIII. The outcome for 2024 has yet to be determined. However, so far, the score is Super Bowl 2, Mardi Gras 0!
If the dates can't be calculated or are subject to change, the next best solution is to develop a system for maintaining the dates. It would also be useful to have a method for estimating future dates. This can be accomplished with SAS software.
In SAS Foundation, the HPFEVENTS procedure can generate a data set that contains an event repository. The event repository contains definitions to create dummy variables for events. Here is a simple example that uses the known and expected Super Bowl dates that are current as of June 30, 2021.
/* Super Bowl and dates
Information obtained from:
MOST RECENT URLS (since 2018):
http://www.nfl.com/superbowl/history
https://en.wikipedia.org/wiki/Super_Bowl
https://en.wikipedia.org/wiki/Super_Bowl_XLVI
https://en.wikipedia.org/wiki/Super_Bowl_XLVII
https://en.wikipedia.org/wiki/Super_Bowl_XLVIII
https://en.wikipedia.org/wiki/Super_Bowl_XLIX
https://en.wikipedia.org/wiki/Super_Bowl_50
https://en.wikipedia.org/wiki/Super_Bowl_LI
https://en.wikipedia.org/wiki/Super_Bowl_LII
https://en.wikipedia.org/wiki/Super_Bowl_LIII
https://en.wikipedia.org/wiki/Super_Bowl_LIV
https://en.wikipedia.org/wiki/Super_Bowl_LV
https://en.wikipedia.org/wiki/Super_Bowl_LVI
https://en.wikipedia.org/wiki/Super_Bowl_LVII
https://en.wikipedia.org/wiki/Super_Bowl_LVIII
https://en.wikipedia.org/wiki/Draft:Super_Bowl_LIX
PREVIOUS URLS:
http://www.superbowl.com/history/mvps
http://www.superbowl.com/features/futuresites
(The asterisks represent the dates of Superbowls that have not yet been played - dates subject to change.)
This information is supplied without any warranty of correctness.
Number Date
LIX Feb. 2, 2025*
LVIII Feb. 4, 2024*
LVII Feb. 5, 2023*
LVI Feb. 6, 2022*
LV Feb. 7, 2021*
LIV Feb. 2, 2020
LIII Feb. 3, 2019
LII Feb. 4, 2018
LI Feb. 5, 2017
50 Feb. 7, 2016
XLIX Feb, 1, 2015
XLVIII Feb. 2, 2014
XLVII Feb. 3, 2013
XLVI Feb. 5, 2012
XLV Feb. 6, 2011
XLIV Feb. 7, 2010
XLIII Feb. 1, 2009
XLII Feb. 3, 2008
XLI Feb. 4, 2007
XL Feb. 5, 2006
XXXIX Feb. 6, 2005
XXXVIII Feb. 1, 2004
XXXVII Jan. 26, 2003
XXXVI Feb. 3, 2002
XXXV Jan. 28, 2001
XXXIV Jan. 30, 2000
XXXIII Jan. 31, 1999
XXXII Jan. 25, 1998
XXXI Jan. 26, 1997
XXX Jan. 28, 1996
XXIX Jan. 29, 1995
XXVIII Jan. 30, 1994
XXVII Jan. 31, 1993
XXVI Jan. 26, 1992
XXV Jan. 27, 1991
XXIV Jan. 28, 1990
XXIII Jan. 22, 1989
XXII Jan. 31, 1988
XXI Jan. 25, 1987
XX Jan. 26, 1986
XIX Jan. 20, 1985
XVIII Jan. 22, 1984
XVII Jan. 30, 1983
XVI Jan. 24, 1982
XV Jan. 25, 1981
XIV Jan. 20, 1980
XIII Jan. 21, 1979
XII Jan. 15, 1978
XI Jan. 9, 1977
X Jan. 18, 1976
IX Jan. 12, 1975
VIII Jan. 13, 1974
VII Jan. 14, 1973
VI Jan. 16, 1972
V Jan. 17, 1971
IV Jan. 11, 1970
III Jan. 12, 1969
II Jan. 14, 1968
I Jan. 15, 1967
*/
/* this creates a data set named superbowl that defines the superbowl event */
proc hpfevents;
eventdef SuperBowl = '15JAN1967'D '14JAN1968'D '12JAN1969'D '11JAN1970'D
'17JAN1971'D '16JAN1972'D '14JAN1973'D '13JAN1974'D '12JAN1975'D
'18JAN1976'D '09JAN1977'D '15JAN1978'D '21JAN1979'D '20JAN1980'D
'25JAN1981'D '24JAN1982'D '30JAN1983'D '22JAN1984'D '20JAN1985'D
'26JAN1986'D '25JAN1987'D '31JAN1988'D '22JAN1989'D '28JAN1990'D
'27JAN1991'D '26JAN1992'D '31JAN1993'D '30JAN1994'D '29JAN1995'D
'28JAN1996'D '26JAN1997'D '25JAN1998'D '31JAN1999'D '30JAN2000'D
'28JAN2001'D '03FEB2002'D '26JAN2003'D '01FEB2004'D '06FEB2005'D
'05FEB2006'D '04FEB2007'D '03FEB2008'D '01FEB2009'D '07FEB2010'D
'06FEB2011'D '05FEB2012'D '03FEB2013'D '02FEB2014'D '01FEB2015'D
'07FEB2016'D '05FEB2017'D '04FEB2018'D '03FEB2019'D '02FEB2020'D
'07FEB2021'D '06FEB2022'D '05FEB2023'D '04FEB2024'D '02FEB2025'D
/ PULSE=DAY ;
eventdata out=superbowl condense;
eventdata out=superbowl_full;
run;
title 'Results of HPFEVENTS - Condensed';
proc print data=superbowl;
run;
Here are the first 15 observations of the condensed superbowl data set:
In SAS Viya, the EVENT object of the ATSM package also stores event definitions and creates dummy variables. If you have a data set output from the HPFEVENTS procedure containing event definitions, that data set can be imported as a table into an INEVENT object of the INOBJ= option of the TSMODEL procedure. However, the full, non-condensed, version of the data set must be used as input to TSMODEL. The following example imports the full data set from the previous example:
/* the inEVENT table can import from HPFEVENTS, but it must be a full data set */
data sascas1.SuperBowl;
set superbowl_full;
run;
/* to verify dates, create a time id that covers the range */
data SBDateRange;
start=MDY(1,1,1967);
end=MDY(12,31,2025);
do date = start to end;
obs = date - start + 1;
output;
end;
format date DATE.;
run;
data sascas1.SBDateRange;
set SBDateRange;
run;
proc tsmodel data = sascas1.SBDateRange
LOGCONTROL= (ERROR = KEEP WARNING = KEEP NOTE=KEEP)
outlog = sascas1.OUTLOG_ind (replace = YES)
inobj = (
inEVENT = sascas1.SuperBowl
)
outobj = (
outEVDUM = sascas1.SuperDummy (replace = YES)
outEVENT = sascas1.TSMSuperBowl (replace = YES)
)
errorstop = YES
;
id date interval=day ;
var obs;
require atsm;
submit;
/* the default version of the input table is HPFEVENTS */
declare object inEVENT(inevent);
declare object dataFrame(tsdf);
declare object eventDB(event);
rc = eventDB.Initialize();
/* Replay the previous outEVENT table, now inEVENT, to transfer the
event definitions to eventDB */
rc = eventDB.Replay(inEVENT);
/* collect the definitions */
declare object outEVENT(outevent);
rc = outEVENT.collect(eventDB);
rc = dataFrame.Initialize(); if rc < 0 then do; stop; end;
/* add all currently defined events (SuperBowl) to
the dataFrame */
rc = dataFrame.AddEvent(eventDB, '_all_');
/* collect event dummy variable SuperBowl */
declare object outEvDum(outEventDummy);
rc = outEvDum.collect(dataFrame);
endsubmit;
run;
quit;
/* show that the dummy variable is non-zero on Super Bowl dates */
title "Super Bowl Dummy Variable Non-Zero Observations";
proc print data=sascas1.SuperDummy(where=(X NE 0));
run;
Examining the first 15 observations of the non-zero values of the SuperDummy data set, we see that the non-zero observations coincide with the dates of the Super Bowl:
Now, an astute observer might take note that since 2004, the Super Bowl dates have all been on the first Sunday in February. (As noted previously, that could change this year.) The announced Super Bowl dates only extend to 2025. What if you are doing a 10 year forecast? The first Sunday in February can be calculated by SAS using the same methodology as the NWKDOM function. It is possible to calculate the first Sunday in February from 2026 forward and store that as an event definition. Then the known event definition for the Super Bowl can be combined with the estimated dates. This would allow you to forecast into the future using your best estimated dates.
In the next example, N1W1FEBYR is the SAS predefined event for the 1st (N1) Sunday (W1) in February (FEB) every year (YR). LS01FEB2026D is a level shift beginning in February 1, 2026. Combination events are defined using the EventComb method of the EVENT object. When the level shift event and the N1W1FEBYR events are combined using element-wsie multiplication, the zeros before the level shift date eliminate the N1W1FEBYR dates before the level shift date, and the ones after the level shift date select the N1W1FEBYR dates from 2026 into the future. Thus, the level shift acts as a filter, and only the future dates are used in the definition. When the estimated dates for the Super Bowl, SuperExt, are combined using element-wise addition with the known Super Bowl dates, SuperBowl, this results in a union of the dates. So, the event definition before February of 2026 is given using the known data, and the event definition in February 2026 and beyond is given using the approximation.
/* to verify dates, create a time id that covers the extended range */
data SBDateRangeExt;
start=MDY(1,1,1967);
end=MDY(12,31,2030);
do date = start to end;
obs = date - start + 1;
output;
end;
format date DATE.;
run;
data sascas1.SBDateRangeExt;
set SBDateRangeExt;
run;
proc tsmodel data = sascas1.SBDateRangeExt
LOGCONTROL= (ERROR = KEEP WARNING = KEEP NOTE=KEEP)
outlog = sascas1.OUTLOG_ind (replace = YES)
inobj = (
inEVENT = sascas1.TSMSuperBowl
)
outobj = (
outEVDUM = sascas1.SuperDummy (replace = YES)
outEVENT = sascas1.TSMSuperBowlExt (replace = YES)
)
errorstop = YES
;
id date interval=day ;
var obs;
require atsm;
submit;
/* the input table was produced by TSMODEL */
declare object inEVENT(inevent('VERSION','TSMODEL'));
declare object dataFrame(tsdf);
declare object eventDB(event);
rc = eventDB.Initialize();
/* Replay the previous outEVENT table, now inEVENT, to transfer the
event definitions to eventDB */
rc = eventDB.Replay(inEVENT);
rc = eventDB.eventcomb("SuperExt",
"event","N1W1FEBYR",
"event","LS01FEB2026D",
'RULE','MULT');
rc = eventDB.eventcomb('SuperBowlExt',
'event','SuperBowl',
'event','SuperExt');
/* collect the definitions */
declare object outEVENT(outevent);
rc = outEVENT.collect(eventDB);
rc = dataFrame.Initialize(); if rc < 0 then do; stop; end;
/* add all currently defined events (SuperBowl) to
the dataFrame.
Add 'class','all' to include combination events */
rc = dataFrame.AddEvent(eventDB, '_all_','class','all');
/* collect event dummy variable SuperBowl */
declare object outEvDum(outEventDummy);
rc = outEvDum.collect(dataFrame);
endsubmit;
run;
quit;
Notice in the above code that:
The OutEventDummy object stores output in a linear format because the events can differ by BY group and so are unknown at the beginning of processing. Extra work is required to format the output into block format. The code below shows how to transpose the output from a linear format into a block format. The resulting data set stores each dummy variable in a separate variable.
proc sort data=sascas1.SuperDummy out=SDSort;
by date;
run;
/* transpose the output to the usual block format for time series variables */
proc transpose data=SDSort
out=SuperDummyBlock(drop= _label_ _NAME_);
var X;
id _XVAR_;
by date;
run;
/* this is the usual (block) form of the event dummy variable output */
title "Extended Super Bowl Dates";
proc print data=SuperDummyBlock(where=((SuperBowlExt NE 0) AND (date GE '01JAN2026'D)));
run;
Examining the non-zero values of the extended dates, notice that:
When dates of events are subject to change, some maintenance will always be required to maintain accuracy in your forecasts. However, the ability to store information and change it easily can result in forecasts that are adaptive to change.
Excellent blog, Tammy! This is so useful!
A practical example, clearly explained -- thanks Tammy!
Thanks, @mvgilliland
Thanks, @BethEbersole
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.