Posted 09-16-2009 05:30 PM
Hi,
There are numerous examples for using model selection (AIC, etc.) to select the best covariance structure for "proc mixed" models.
However, I am interested in ranking models with different (non-nested) fixed effects. It is my understanding that REML likelihoods cannot be used to compare models unless they have the same fixed effects, so REML is unsuitable for this kind of model selection. If I use maximum likelihood (ML), the results can differ considerably from the same model estimated via REML.
Is there any procedure for ranking models with non-nested fixed effects using ML? I have searched around, but almost all of the model selection examples are for determining the best covariance structure.
cheers.
5 REPLIES
When you specify ML, you still get information criteria (AIC, etc.) which can be employed to rank models. All the usual caveats about using information criteria to select a model would still apply.
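For example, a minimal sketch (the data set and variable names here are hypothetical), where METHOD=ML requests maximum likelihood so the reported AIC/AICC/BIC are comparable across models with different fixed effects:

```sas
/* Hypothetical data set and variables; METHOD=ML overrides the
   default REML estimation so the fit statistics reflect the full
   likelihood, including the fixed effects. */
proc mixed data=mydata method=ml;
   class subject;
   model y = x1 x2 / solution;
   random intercept / subject=subject;
run;
```

The "Fit Statistics" table in the output then lists -2 Log Likelihood, AIC, AICC, and BIC for ranking the candidate models.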
Dale
One additional thing to check is that PROC MIXED uses exactly the same observations for each set of fixed effects before you compare information criteria. For example, if one set of fixed effects has a different pattern of missing data than another, you would want to limit the data to observations that are not missing any of the fixed effects you wish to compare.
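One way to enforce a common analysis set (variable names are hypothetical; NMISS counts missing values among the listed numeric variables):

```sas
/* Keep only observations with no missing values among the response
   and every candidate fixed effect, so that all candidate models
   are fit to the same observations. */
data common;
   set mydata;
   if nmiss(y, x1, x2, x3) = 0;
run;
```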
Thank you both for your comments. There is no problem with missing data. My remaining question about ML and model selection is that ML is often described as biased. The best I could glean from the web is that the parameter estimates are the same, but the SEs, etc. are often smaller under ML. One internet board suggested using ML to rank models, and then estimating the fixed effects of the best model via REML.
I was wondering whether anyone has done model selection using ML, and whether there is a recommended procedure for (1) estimating the covariance structure, (2) ranking the models via ML, and (3) estimating the best model?
It is incorrect to believe that the parameter estimates are the same for ML as they are for REML. The parameter estimates may be asymptotically the same, but in a finite sample you would expect the parameter estimates for ML and REML to differ.
The approach of ranking models fitted by ML using information criteria and then re-estimating the parameters of the "best" model via REML is acceptable. However, different IC can lead to different "best" models. Also, this general approach is very much like stepwise regression, which is known to be problematic. It is usually better to employ a model specified a priori by theory.
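In outline, the two-stage approach might look like this (all names are hypothetical; the ML fits are ranked by AIC, then the chosen model is refit with REML):

```sas
/* Stage 1: fit each candidate fixed-effects model with METHOD=ML
   and capture the information criteria for ranking. */
proc mixed data=common method=ml;
   class subject;
   model y = x1 x2 / solution;
   random intercept / subject=subject;
   ods output FitStatistics=fit_m1;
run;

/* ... repeat for the other candidate fixed-effects models ... */

/* Stage 2: refit the AIC-best model with REML (the PROC MIXED
   default) for the final estimates and standard errors. */
proc mixed data=common method=reml;
   class subject;
   model y = x1 x2 / solution;
   random intercept / subject=subject;
run;
```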
I have a similar issue: I have a set of a priori hypotheses with fixed and random effects. Under REML, PROC MIXED counts only the covariance parameters when it calculates the penalty term for AIC, so you get different AIC values from the REML output than when you calculate AIC manually counting all of the parameters.
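To make the difference concrete, here is a hand calculation of AIC = -2LL + 2k under the two parameter counts (the numbers are made up for illustration):

```sas
/* Hypothetical fit: -2 Res Log Likelihood = 512.4, with
   q = 2 covariance parameters and p = 3 fixed-effect parameters. */
data aic_check;
   neg2ll = 512.4;
   q = 2;  /* covariance parameters: what PROC MIXED counts under REML */
   p = 3;  /* fixed-effect parameters */
   aic_reml = neg2ll + 2*q;        /* 516.4: matches the REML output  */
   aic_full = neg2ll + 2*(q + p);  /* 522.4: counts all parameters    */
   put aic_reml= aic_full=;
run;
```

This is why a manually computed AIC that counts the fixed effects will not match the REML AIC reported by PROC MIXED.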