Hi! I am trying to use stepwise selection to build my regression with the best possible variables. However, I have a serious problem with autocorrelation, which I correct for using
proc autoreg data=xxx;
   model y = x1 x2 / method=ml nlag=5 backstep;
run;
When I run this procedure on the variables that stepwise returned, they often come back with very high p-values. So my question:
Is there a way to build the method of maximum likelihood into the stepwise regression? This would allow me to rid the model of autocorrelation while giving me the best variables under these autoregressive circumstances.
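Before correcting for it, the autocorrelation described above can be quantified by requesting Durbin-Watson statistics on the uncorrected fit. This is a minimal sketch only; the dataset name work.mydata and the variables y, x1, and x2 are placeholders that do not come from the original post.

```sas
/* Hypothetical dataset and variable names; substitute your own. */
proc autoreg data=work.mydata;
   /* No NLAG= option here, so this is an ordinary least squares fit. */
   /* DW=5 prints Durbin-Watson statistics for orders 1 through 5;    */
   /* DWPROB adds p-values for those statistics.                      */
   model y = x1 x2 / dw=5 dwprob;
run;
```

Small p-values at a given order suggest autocorrelated errors at that lag, which can help in choosing a starting value for NLAG=.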
Hello,
Are you struggling with autocorrelation because you are estimating linear regression models for time series data or are you struggling with autocorrelation because your subsequent (cross-sectional) observations are just not independent (and hence the errors are autocorrelated)?
A good response to your question will depend on your answer to the question above.
If you are dealing with time-series data or panel data (time-series cross-sectional data), there are better alternatives than PROC AUTOREG.
That said, PROC AUTOREG is useful for estimating various GARCH-type models.
Kind regards,
Koen
I am using time series data!
"I am trying to use a stepwise selection to run my regression with the best possible variables."
I think you give stepwise way too much credit. It doesn't find the "best possible variables", whatever that means. This is a direct quote taken from a class taught by someone at SAS: "Stepwise selection was devised to provide a computationally efficient alternative to examining all subsets (of variables). It is not guaranteed to find the best subset (of variables) and it can be shown to perform badly in many situations (Harrell 1997)."
Paige Miller
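On the original question: as far as I know, the BACKSTEP option in PROC AUTOREG eliminates insignificant autoregressive lags only, and it does not select among the regressors. One possible workaround is to do the regressor comparison by hand: fit each candidate set with the same autoregressive error structure and compare the fits. The sketch below assumes a hypothetical dataset work.mydata with response y and candidate regressors x1, x2, and x3; it inherits all the limitations of stepwise-style selection described above.

```sas
/* Hypothetical dataset and variable names; substitute your own. */

/* Step 1: all candidate regressors, AR errors up to lag 5,       */
/* estimated by maximum likelihood. BACKSTEP drops insignificant  */
/* AR lags only; the regressors stay in the model.                */
proc autoreg data=work.mydata;
   model y = x1 x2 x3 / method=ml nlag=5 backstep;
run;

/* Step 2: refit without a regressor whose t test was weak in     */
/* step 1 (x3 is used here purely as an illustration).            */
proc autoreg data=work.mydata;
   model y = x1 x2 / method=ml nlag=5 backstep;
run;
```

Comparing the AIC and SBC values that each fit reports gives a rough basis for choosing between the two models under the autoregressive error assumption.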