Fluorite | Level 6

I have a dataset with visitors and weather variables, and I'm trying to forecast visitors based on the weather variables. Since the dataset only covers visitors in season, there are missing values and gaps in every year. Running PROC REG in SAS works fine, but the problem comes when I use PROC VARMAX: I cannot run the regression because of the missing values. How can I tackle this?

proc varmax data=tivoli4 printall plots=forecast(all);
   id obs interval=day;
   model lvisitors = rain sunshine averagetemp
         dfebruary dmarch dmay djune djuly daugust doctober dnovember ddecember
         dwednesday dthursday dfriday dsaturday dsunday
         d_24Dec2016 d_05Dec2013 d_24Dec2017 d_24Dec2014 d_24Dec2015 d_24Dec2019
         d_24Dec2018 d_24Sep2012 d_06Jul2015 d_08feb2019 d_16oct2014 d_15oct2019
         d_20oct2016 d_15oct2015 d_22sep2017 d_08jul2015 d_20Sep2019 d_08jul2016
         d_16oct2013 d_01aug2012 d_18oct2012 d_23dec2012 d_30nov2013 d_20sep2014
         d_17oct2012 d_17jun2014
         dFrock2012 dFrock2013 dFrock2014 dFrock2015 dFrock2016 dFrock2017
         dFrock2018 dFrock2019
         dYear2015 dYear2016 dYear2017
         / p=7 q=2 method=ml dftest;
   garch p=1 q=1 form=ccc outht=conditional;
   restrict ar(3,1,1)=0, ar(4,1,1)=0, ar(5,1,1)=0,
            XL(0,1,13)=0, XL(0,1,14)=0, XL(0,1,13)=0,
            XL(0,1,27)=0, XL(0,1,38)=0, XL(0,1,42)=0;
   output lead=10 out=forecast;
run;

Jade | Level 19

You don't say whether the missing values are in the dependent variable or the independent variables. If they are in the dependent variable, I would delete those observations from the fitting part (you could still use them to predict values and check the appropriateness of your model). For the independent variables, you might look at PROC EXPAND, which can interpolate missing values.
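As a rough illustration of the PROC EXPAND idea, here is a minimal sketch that linearly interpolates the three weather variables. It assumes obs is a SAS date variable, that tivoli4 is sorted by it, and that linear interpolation is acceptable for your data; the output name tivoli4_filled is made up for the example.

```sas
/* Sketch only: fill missing weather values by linear interpolation. */
/* Assumes obs is a SAS date and tivoli4 is sorted by obs.           */
/* tivoli4_filled is a hypothetical output data set name.            */
proc expand data=tivoli4 out=tivoli4_filled from=day method=join;
   id obs;
   convert rain sunshine averagetemp;
run;
```

Interpolating across a whole off-season gap is a much stronger assumption than filling an occasional missing day, so inspect the filled values before passing them to VARMAX.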

Also, how many records do you have?  Since you are looking at 54 independent variables, you probably need around 300 complete observations to get stable results.
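A quick way to answer both questions (where the missing values are, and how many complete records you have) might look like the sketch below. It only checks the response and the three weather variables; add the dummy variables to the lists if they can be missing too. The data set name complete_cases is hypothetical.

```sas
/* Count non-missing (N) and missing (NMISS) values per variable. */
proc means data=tivoli4 n nmiss;
   var lvisitors rain sunshine averagetemp;
run;

/* Keep only rows where none of the listed variables is missing.   */
/* The log reports how many observations complete_cases contains.  */
data complete_cases;
   set tivoli4;
   if cmiss(lvisitors, rain, sunshine, averagetemp) = 0;
run;
```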

Since VARMAX does not support missing values, interpolation or multiple imputation are the only alternatives I see for a vector autoregressive model using VARMAX. You might look at some of the state space examples to see if they can provide you with a road forward.

Also, try asking this over in the Forecasting & Econometrics community.  Perhaps someone with more experience in time series analysis could give you some help.



