03-29-2016 06:14 PM
I am using PROC GLIMMIX for binary response data with 18 fixed effects, a random effect, and 400,000 data points. I am using the following code:
proc glimmix data=mydata initglm scoring=5;
   class subject v10 v13;
   model y = v1 v2 ... v18 / dist=binary ddfm=residual solution;
   random int v1 v2 ... v100 / sub=subject;
run;
I get these warnings with the scaled data:
WARNING: A new singularity occurred in column 27 of X`Inv(V)X.
WARNING: Obtaining minimum variance quadratic unbiased estimates as starting values for the covariance parameters failed.
After removing the scaling from some variables, the warnings go away, but the resulting estimates don't make sense.
Does scaling create a linear relationship between some variables in my data? The variance is large in some of my variables, so I need to scale them.
I appreciate your help.
03-30-2016 09:27 AM
I realize that you provided pseudocode, but it looks like you are putting v1-v18 in both the MODEL and RANDOM statements. An effect is either FIXED or RANDOM, so make sure your real code models those variables in one way or the other.
03-30-2016 01:50 PM
Thanks a lot for your comments,
Actually, I am using some of my fixed effects as random effects as well, to capture variability between subjects. Not all of the fixed-effect variables are also random (the V1..V18 in the RANDOM statement are not exactly the same as those in the MODEL statement). I have a convergence problem, and I have tried different options to solve it (changing PCONV and GCONV, using other methods such as LAPLACE, and so on). I could make the model converge, but the z scores are infinite for some variables, which is probably due to collinearity and a standard error of zero.
You have provided a document on solving convergence problems with GLIMMIX. I tried everything in it, but I still get either the convergence problem or results that don't make sense to me.
I appreciate any help.
03-30-2016 01:57 PM
Given all that you have tried, I fear that it is a collinearity problem--and if the random effects are collinear, or nearly collinear, then you may not have enough data to distinguish them, with the resulting 0's and infinities and all the rest of the problems associated.
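One rough way to screen for collinearity, offered as a sketch rather than a definitive diagnostic: collinearity is a property of the predictors alone, so the VIF and COLLINOINT diagnostics in PROC REG can be run on the same numeric predictors even though the response is binary. The variable list below is a placeholder (the full list was elided in the original post), and the code assumes y and the predictors are numeric. The same screen could be applied to the variables used as random slopes.

```sas
/* Sketch only: collinearity diagnostics depend on the predictors, not on */
/* the distribution of y, so PROC REG is used here purely as a screen.    */
/* v1 v2 v3 are placeholders for the real list of numeric predictors.     */
proc reg data=mydata;
   model y = v1 v2 v3 / vif tol collinoint;
run;
quit;
```

As a common rule of thumb, very large variance inflation factors or condition indices suggest that some predictors are nearly linear combinations of others.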
The other possibility is quasi-complete separation, but with only 2 class variables, this may not be the case. What does the cross-tabulation for v10*v13*y look like?
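A minimal sketch of that cross-tabulation, assuming the data set and variable names from the original post:

```sas
/* List every v10*v13*y combination with its count.                     */
/* If a v10*v13 combination appears with only one value of y, that cell */
/* has a single outcome, which points to quasi-complete separation.     */
proc freq data=mydata;
   tables v10*v13*y / list missing;
run;
```

The LIST option prints only the combinations that occur in the data, so a missing row for a given v10*v13 pair and value of y is the zero cell to look for.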
03-31-2016 10:53 AM
Interesting point, @lvm. So let's suppose there are 100 (well 98) different continuous random effects where you are estimating a random slope, and that none of these are correlated (best case) and that you have a minimum of 3 points to estimate each slope. Is the minimum number of modeled events over each of the 100 random effects at least 150 (50 events times 3 points)? If not, there may be algorithmic problems.
Is there any way at all to eliminate some of the random effects by converting to events/trials syntax, and consolidating over the random effects that may not be as interesting?
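As a rough illustration of the events/trials idea, and only under strong assumptions: the sketch below assumes y is numeric and coded 0/1, and it keeps only subject, v10, and v13, so it would apply only if the continuous covariates can be dropped or binned. The names agg, events, and trials are hypothetical.

```sas
/* Collapse to one row per subject*v10*v13 combination.              */
/* SUM of a 0/1 response gives the event count, N the trial count.   */
proc summary data=mydata nway;
   class subject v10 v13;
   var y;
   output out=agg (drop=_type_ _freq_) sum=events n=trials;
run;

/* Fit the aggregated data with events/trials syntax and a single    */
/* random intercept per subject.                                     */
proc glimmix data=agg;
   class subject v10 v13;
   model events/trials = v10 v13 / dist=binomial solution;
   random intercept / subject=subject;
run;
```

This trades covariate detail for a much smaller data set and far fewer covariance parameters, which may be enough to show whether the remaining model is estimable.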
03-31-2016 05:07 PM
Thanks so much for your comments,
I think Steve's point about collinearity between my variables is right. I think the singularity problem and the infinite z scores are caused by collinearity among v10, v13, and y. I have also used interactions of v13 with some other variables, and some of those interactions are probably collinear with v10 and y. Now I see why scaling some variables creates the singularity and the infinite z scores: it makes their interactions with v13 more correlated with y. Am I right?
Also, I have used 8 continuous variables as random effects, and the rest of them are binary variables (not dummy though).
Do you have any suggestions for solving this problem and getting reasonable results?
03-31-2016 05:11 PM
Also, when I leave either v10 or v13 out of the CLASS statement (so that it is not treated as categorical), the problem goes away.
v10 has three levels and v13 has two levels.
04-01-2016 12:29 PM
That last part sounds like quasi-separation--once the categories are ignored for these variables, responses are no longer segregated into levels such that a level has only a single outcome. So it comes down to that 3-category variable in all likelihood.