I am trying to analyze a huge dataset with 87,000 subjects. Each subject has 4 repeated measures (1 week, 3 weeks, 6 weeks, 2 months). The model we are developing has both G-side and R-side random effects (i.e. repeated measures plus additional random effects). When I run the analysis, GLIMMIX gives a "Did not converge" error.

I have tried all of the options GLIMMIX offers for controlling the optimization process, e.g. changing the convergence criterion (pconv = 1e-4), increasing the maxiter parameter, and adjusting the options that control the inner iterations (available through the nloptions statement), but none of this works.

However, I noticed that if I take a small random sample of 6,000 subjects (from the original 87,000) and run the same model on this sub-sample, GLIMMIX converges. So I suspect the issue is that the sample size (87,000 subjects) is just too large. Also, one G-side random effect has 120 levels and the other has 8 levels, so the Z matrix has 128 columns in total. Coupled with the large sample size, this may be the source of the problem I'm experiencing.

Is there an upper limit on the sample size and Z matrix size that GLIMMIX can handle? I know the MIXED procedure has a high-performance version for huge datasets called HPMIXED; does GLIMMIX have one as well?

A quick description of the study: it is a multi-center trial (8 different hospitals) aimed at gauging the effectiveness of doctors strongly recommending that their patients quit smoking. For each patient, the doctor's recommendation was given only once (during one of the patient's visits to the hospital). Each patient was then asked about their smoking status 1 week, 3 weeks, 6 weeks, and 2 months after receiving the recommendation (hence the 4 repeated measures). These 4 repeated measures form the R-side component of the model. We included the variables 'hospital' and 'doctor' (nested within 'hospital') as random effects, and these form the G-side component.
As I mentioned earlier, the variable 'doctor' has 120 levels and the second G-side variable 'hospital' has 8 levels, producing a rather large Z matrix with 128 columns. The code is given below:

    proc glimmix data=lib.data pconv=1e-4;
      class agegroup gender time user doctor hospital subjectID;
      model QuitSmoke(event='1') = agegroup gender time user time*user
            / dist=binary link=logit solution covb;
      /* R-side: AR(1) structure across the 4 repeated measures within each subject */
      random time / subject=subjectID residual type=ar(1);
      /* G-side: doctor nested within hospital, plus hospital */
      random doctor(hospital) hospital;
      nloptions technique=congra maxiter=1000 gconv=1e-4;
    run;

Any help on this would be greatly appreciated! Thanks.
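In case it helps anyone reproduce the sub-sample comparison described above, here is a minimal sketch of one way to draw a subject-level random sample. It is an illustration under assumptions, not necessarily the exact code used: the work dataset names `subjects`, `sample_ids`, and `subsample` and the seed value are placeholders. It samples subject IDs rather than rows, so every selected subject keeps all 4 repeated measures.

```sas
/* Sketch only: subjects, sample_ids, subsample and the seed are placeholders. */

/* 1. Build a list with one row per subject. */
proc sort data=lib.data(keep=subjectID) out=subjects nodupkey;
  by subjectID;
run;

/* 2. Draw a simple random sample of 6000 subject IDs. */
proc surveyselect data=subjects out=sample_ids
                  method=srs sampsize=6000 seed=12345;
run;

/* 3. Keep every repeated measure belonging to the sampled subjects. */
proc sql;
  create table subsample as
  select d.*
  from lib.data as d
       inner join sample_ids as s
       on d.subjectID = s.subjectID;
quit;
```

The same GLIMMIX step can then be run with data=subsample in place of data=lib.data.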