06-06-2013 12:20 PM
I am working with a multiple regression using 2 different mixed models. Both have the same predictor variables: one has a count response, for which I am using Poisson regression in PROC GLIMMIX, and the other has a continuous response, for which I am using "regular" regression in PROC MIXED. I have looked at pairwise plots of the variables to try to identify potential collinearity problems, and there appear to be some slight correlations. However, I am looking for a more standard way to decide whether I need to eliminate variables.
I have done some reading about using criteria such as VIF, Condition Index, and Tolerance to assess multicollinearity. However, PROC MIXED and GLIMMIX do not compute these.
So here is my question:
1) Is it valid to use PROC REG (which will not handle random effects) to calculate VIF, CI and Tolerance that can be applied to the mixed model multiple regression? Clearly I don't really understand all the math here, so please use small words!
THANK YOU! And I will give credit to helpful answers.
06-06-2013 01:05 PM
Yes, you can use PROC REG to compute collinearity diagnostics. These collinearity diagnostics involve only the X side of the equation, and so they don't change if the Y changes.
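To make the "only the X side" point concrete, here is a small sketch outside of SAS, in Python with numpy. It is not what PROC REG does internally, just the textbook definition: the VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors, and tolerance is 1 / VIF. The function name `vif` and the simulated variables `x1`, `x2`, `x3` are made up for illustration. Note that no response variable appears anywhere.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated predictors (hypothetical): x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)            # large for x1 and x2, close to 1 for x3
tolerance = 1.0 / vifs   # tolerance is just the reciprocal of VIF
print(vifs, tolerance)
```

Because the calculation never touches Y, the same numbers apply whether the response is the count or the continuous variable.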
Another approach to collinearity is to use PROC PLS, which will give predictions and model coefficient estimates with lower mean square errors than what you would get from ordinary least squares in the presence of collinearity amongst the X variables (but the estimates are biased).
06-06-2013 01:15 PM
The one thing that you might not pick up on regarding collinearity and mixed models is what I call level confounding: certain X values are observed only on certain levels of the random effects. This will appear as a collinearity in PROC REG, but should not present a problem in mixed models, AS FAR AS ALGORITHM STABILITY is concerned. Of course, it also means that the random effect by treatment interaction probably needs to be fit as a random effect, and you wander off in search of enough data to make things converge meaningfully.
So use PROC REG to check diagnostics. Include the random effects. You'll probably have to run your data through PROC GLMMOD first to get all of the class levels coded into dummy variables of the same parameterization that is used in MIXED and GLIMMIX. Interpret collinearity regarding random effects carefully.
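For anyone who wants to see why level confounding shows up as collinearity once the random effect is dummy coded, here is a toy sketch in Python with numpy (not SAS, and not the GLMMOD parameterization itself). The grouping variable `block` and the predictor `x` are invented for illustration: `x` takes a single value within each block, so it can be rebuilt exactly from the block dummy columns.

```python
import numpy as np

# Six observations in three levels of a hypothetical random effect "block".
block = np.array([0, 0, 1, 1, 2, 2])
# x takes one value per block, so it never varies within a block.
x = np.array([1.0, 1.0, 2.5, 2.5, 4.0, 4.0])

# One 0/1 dummy column per block level.
dummies = (block[:, None] == np.arange(3)).astype(float)
design = np.column_stack([x, dummies])

# Four columns, but only three are linearly independent.
rank = np.linalg.matrix_rank(design)

# x is reproduced exactly by a weighted sum of the block dummies.
weights, *_ = np.linalg.lstsq(dummies, x, rcond=None)
max_err = np.max(np.abs(dummies @ weights - x))
print(design.shape[1], rank, max_err)
```

A fixed-effects regression on this design sees `x` as perfectly collinear with the block columns, which is the pattern to read with care when the block is really a random effect.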