Forecasting using SAS Forecast Server, SAS/ETS, and more

high collinearity among explanatory variables in forecasting models

Occasional Contributor

Hi, all:
I have a set of explanatory variables with high collinearity that I want to use in a time series regression model. A friend of mine used ridge regression to deal with this problem in time series data. I have never heard of ridge regression being used in time series modeling, and I am doubtful about his approach. Could you share ways to deal with high collinearity among explanatory variables in time series forecasting models? Thanks.
SAS Employee

Re: high collinearity among explanatory variables in forecasting models

Hello -
You might want to check out the new ENTROPY procedure of SAS/ETS - see:

Taken from there: "It is often the case that the statistical/economic model of interest is ill-posed or under-determined for the observed data. For the general linear model, this can imply that high degrees of collinearity exist among explanatory variables or that there are more parameters to estimate than observations available to estimate them. These conditions lead to high variances or non-estimability for traditional generalized least squares (GLS) estimates.
Under these situations it might be in the researcher’s or practitioner’s best interest to consider a nontraditional technique for model fitting. The principle of maximum entropy is the foundation for an estimation methodology that is characterized by its robustness to ill-conditioned designs and its ability to fit over-parameterized models.
Generalized maximum entropy (GME) is a means of selecting among probability distributions to choose the distribution that maximizes uncertainty or uniformity remaining in the distribution, subject to information already known about the distribution. Information takes the form of data or moment constraints in the estimation procedure. PROC ENTROPY creates a GME distribution for each parameter in the linear model, based upon support points supplied by the user. The mean of each distribution is used as the estimate of the parameter. Estimates tend to be biased, as they are a type of shrinkage estimate, but typically portray smaller variances than ordinary least squares (OLS) counterparts, making them more desirable from a mean squared error viewpoint"
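
To make the "biased, but smaller variance" point in the quoted passage concrete, here is a minimal simulation sketch. It is not PROC ENTROPY and not GME: it swaps in ridge regression (the shrinkage method mentioned in the question) because it has a simple closed form. Everything in it is an assumption made for illustration: the function names `fit_two_regressors` and `simulate`, the two nearly collinear regressors, true coefficients of 1 and 1, the ridge penalty of 1.0, and independent errors (so it ignores the serial correlation a real time series model would have to handle).

```python
import random
import statistics


def fit_two_regressors(x1, x2, y, ridge=0.0):
    """Solve (X'X + ridge*I) b = X'y for a no-intercept model with two regressors.

    ridge=0.0 gives ordinary least squares; ridge>0 gives ridge regression.
    """
    s11 = sum(a * a for a in x1) + ridge
    s22 = sum(b * b for b in x2) + ridge
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 system
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def simulate(n_reps=500, n_obs=50, ridge=1.0, seed=0):
    """Compare OLS and ridge estimates of the first coefficient over many samples."""
    rng = random.Random(seed)
    true_b1 = 1.0
    ols_b1, ridge_b1 = [], []
    for _ in range(n_reps):
        x1 = [rng.gauss(0, 1) for _ in range(n_obs)]
        # x2 is x1 plus a little noise, so the two regressors are nearly collinear.
        x2 = [a + rng.gauss(0, 0.05) for a in x1]
        # Hypothetical data-generating process: y = 1*x1 + 1*x2 + noise.
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        ols_b1.append(fit_two_regressors(x1, x2, y)[0])
        ridge_b1.append(fit_two_regressors(x1, x2, y, ridge)[0])

    def mse(estimates):
        return statistics.mean((b - true_b1) ** 2 for b in estimates)

    return {
        "ols_var": statistics.variance(ols_b1),
        "ridge_var": statistics.variance(ridge_b1),
        "ols_mse": mse(ols_b1),
        "ridge_mse": mse(ridge_b1),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

In this setup the OLS estimate of the first coefficient swings widely from sample to sample because the data can barely tell the two regressors apart, while the ridge estimate is pulled toward a stable value at the cost of a small bias. The comparison depends on the penalty and on where the true coefficients lie, so treat it as an illustration of the trade-off rather than a recommendation for a particular penalty value.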
