I've configured two repeated measures models, one using PROC MIXED and the other using PROC GLIMMIX, as consistently as possible to the best of my understanding. The experiment is a factorial RCBD.
proc mixed data=df plots=studentpanel;
class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time;
repeated Year_Time / type=un sub=ID_S_P;
random Block(ID_S);
run;
proc glimmix data=df plots=studentpanel method=rspl;
class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time;
random Year_Time / residual type=un subject=ID_S_P;
random Block(ID_S);
/* nloptions maxiter=500; */  /* tried to avoid the "did not yield a valid objective function" error */
/* nloptions technique=nrridg; */
run;
The within-subject correlations are complex, and an unstructured covariance yields the best (lowest) information criteria (AIC, BIC). The MIXED model doesn't execute with compound symmetry, giving the "WARNING: Stopped because of infinite likelihood" error.
I'm curious as to why I cannot get the GLIMMIX model to execute, and why I get the error "The initial estimates did not yield a valid objective function." every time I attempt to run it. To address this I've tried increasing the iterations (via INITITER and MAXITER, the latter shown commented out) and changing the optimization technique to NRRIDG, which I've read is the default for PROC MIXED but not PROC GLIMMIX. Neither has worked; I get the valid-objective-function error each time. I considered passing PARMS based on results from MIXED, but that seems impractical given the number of covariance parameters with an unstructured correlation (28). Why might GLIMMIX be failing where MIXED works?
Also, it occurred to me that the within-subject measurements may be so irregular that attempting to model within-subject correlation is pointless. How might one verify whether there is any reason to use repeated measures at all? When fitting the MIXED model with an AR(1) structure, the within-subject correlation estimate is less than the residual estimate, but not very low (see picture). I could use some practical advice in this regard. Thanks for reading!
I have a question about your model statement. You include several terms and all of the interactions up to the four-way. Just fitting the main effects would require solving for over 40 parameters (intercept + sum of (levels for each main effect - 1)). I don't want to guess how many are involved for your full model, but I would suspect it is at least 1000. How many observations do you have? The error message implies that you don't have enough data to estimate all of the parameters. I would suggest simplifying the model to include only main effects for a preliminary run, to see if you can get convergence. Then you could add in interactions (the @ option is very handy for this). I am hopeful that you could still get convergence. At the three-way level, you need a lot of observations to have sufficient power to detect differences, and in most cases I have worked with, a four-way interaction is indistinguishable from random noise. So - simplify and see what occurs.
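As a sketch of that simplification idea, the @ operator caps the interaction order in the MODEL statement (the @2 cutoff here is just an example; raise it step by step once the simpler fit converges):

```sas
proc glimmix data=df;
   class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
   /* main effects plus two-way interactions only */
   model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time @2;
   random Year_Time / residual type=un subject=ID_S_P;
   random Block(ID_S);
run;
```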
SteveDenham
It is a good idea to include the LOG with the code and ALL the messages generated by a procedure. There are often clues provided in other messages.
Best to provide an example of the data as a working DATA step that behaves the same as your full data set. Note that sometimes the process of making one of these example sets, and testing to see that the behavior is the same, gives clues to a solution.
This error message -- "The initial estimates did not yield a valid objective function." can often be dealt with by a PARMS statement. You can use the ODS OUTPUT COVPARMS= statement in PROC MIXED to save the parameter estimates into a SAS data set, then in PROC GLIMMIX, use the PDATA= option in the PARMS statement to read in the initial parameter values.
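A minimal sketch of that workflow, using the model from the original post. One caution: GLIMMIX matches PDATA= values to parameters by position, so check that the order of rows in the saved data set matches the order of covariance parameters GLIMMIX expects (compare the "Covariance Parameter Estimates" tables from both procedures):

```sas
/* save the converged covariance parameter estimates from MIXED */
proc mixed data=df;
   class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
   model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time;
   repeated Year_Time / type=un sub=ID_S_P;
   random Block(ID_S);
   ods output CovParms=cp;
run;

/* feed them to GLIMMIX as starting values */
proc glimmix data=df;
   class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
   model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time;
   random Block(ID_S);
   random Year_Time / residual type=un subject=ID_S_P;
   parms / pdata=cp;
run;
```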
As pointed out by @ballardw , it might be a good idea to include the Log, the Output, and the example data set so we can have a better idea of the situation. For example, depending on your data values, the subject= effect might be subject=id_s_p(block id_s) rather than subject=id_s_p. Also, we can understand better your "irregular" time points situation.
Thanks,
Jill
Thanks @SteveDenham. There are ~900 observations: 128 per year-time and 64 per site. Per treatment, there are only 4 observations. I'm still getting a feel for what is reasonable to test given the limits of the data available. Thanks
If I simplify the model by removing interactions and nesting, I can usually get things to work. However, in my modeling I want to be true to my experimental design. I was just surprised to see it work in mixed and not glimmix.
It's telling that the computer, a PC with decent computing power, takes ~2 minutes to run the program with MIXED.
Thanks,
Fred
@wateas - Have you done any sensitivity checking for the PROC MIXED runs? What kind of results do you see if you use a grid search in a PARMS statement? Does it look like the convergence is insensitive to starting values, such that the final -2 log likelihood is the same no matter where you start? I respect wanting to reflect the experimental design in the analysis - I also believe in hierarchical interaction inclusion (e.g., a two-way interaction requires that the main effects are included in the model; a three-way requires the main effects and the 3 two-way interactions, etc.). In any case, the comments by @jiltao and @JackieJ_SAS both have some approaches that might help.
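As a sketch of such a grid search (the grid values are hypothetical; a grid over all 28 unstructured parameters would be unmanageable, so this is more practical on a reduced structure such as AR(1), where the only covariance parameters are the correlation and the residual variance):

```sas
proc mixed data=df;
   class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
   model Soil_PAN_LN = Trt_Amend_App Trt_CC ID_S Year_Time;
   repeated Year_Time / type=ar(1) sub=ID_S_P;
   /* grid over the AR(1) correlation and the residual variance;
      MIXED evaluates -2 log likelihood at each grid point and
      starts the optimization from the best one */
   parms (0.1 to 0.9 by 0.2) (0.2 to 1.0 by 0.2);
run;
```

If the final -2 log likelihood is the same from every grid point, the fit is likely insensitive to starting values.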
(The following was added later, and I don't know if it would help with convergence issues.)
One more idea - what about using a spline for the repeated measures variance-covariance structure to handle the irregular spacing? See Example 51.6 (SAS/STAT 15.2 documentation) Radial Smoothing of Repeated Measures Data, which looks at body weights of cows taken at unequally spaced time points. Time is treated as a continuous variable in this case. What I don't see in this example is a way to compare LSMEANS at various time points, but that could possibly be handled with an AT= option in LSMEANS or LSMESTIMATE statements.
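A sketch along the lines of that documentation example, treating time as continuous (so a hypothetical numeric Time variable replaces Year_Time and is left out of the CLASS statement; the knot settings are illustrative, taken from the style of the cow body-weight example):

```sas
proc glimmix data=df;
   class Trt_Amend_App Trt_CC ID_S Block ID_S_P;
   model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S Time;
   /* radial smoothing spline over continuous time within each subject */
   random Time / type=rsmooth subject=ID_S_P knotmethod=kdtree(bucket=8);
run;
```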
SteveDenham
Hi, While something else might be going on, it is worthy to note that the default optimization methods are not identical in MIXED and GLIMMIX. MIXED uses Newton-Raphson with ridging (and you cannot change this). You can change the optimization method in GLIMMIX using the NLOPTIONS statement with TECHNIQUE option.
From the documentation I see this comment about the default optimization method in GLIMMIX:
The default optimization techniques are TECHNIQUE=NONE for noniterative estimation, TECHNIQUE=NEWRAP for singly iterative methods in GLMs, TECHNIQUE=NRRIDG for pseudo-likelihood estimation with binary data, and TECHNIQUE=QUANEW for other mixed models.
Personally, I had a simulation study I once ran that had widely different convergence rates depending on which method I specified on the NLOPTIONS statement. MIXED was also a comparator in my study. In the end, for my cases (which admittedly were unusual), I saw poor convergence rates for MIXED, the default in GLIMMIX, and many of the NLOPTIONS techniques, but did find one of the GLIMMIX techniques that had very high convergence rates. The moral is- convergence in GLIMMIX can change a lot based on what option you specify on the NLOPTIONS statement.
Hi @JackieJ_SAS , inquiring minds want to know where they could find the results to that simulation, please and thank you.🤔
SteveDenham
@SteveDenham, The overall simulation results are discussed in this paper:
https://pubmed.ncbi.nlm.nih.gov/26089186/
Unfortunately, I don't show the results for trying out different optimization methods from GLIMMIX and MIXED; we just summarize the best one. For our set of cases -- which had a low number of independent observations with high numbers of repeated measures -- the best method turned out to be TECHNIQUE=DBLDOG. To be honest, I have no idea why that one worked best!
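For reference, switching GLIMMIX to that optimizer is one added statement (whether it helps with this particular model is an empirical question; the rest of the code is the model from the original post):

```sas
proc glimmix data=df method=rspl;
   class Trt_Amend_App Trt_CC Year_Time ID_S Block ID_S_P;
   model Soil_PAN_LN = Trt_Amend_App | Trt_CC | ID_S | Year_Time;
   random Year_Time / residual type=un subject=ID_S_P;
   random Block(ID_S);
   nloptions technique=dbldog;  /* double-dogleg, a trust-region method */
run;
```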
Thank you for asking!
Thanks @JackieJ_SAS !
The MATLAB website (https://www.mathworks.com/help/optim/ug/equation-solving-algorithms.html ) says:
The trust-region-dogleg algorithm is efficient because it requires only one linear solve per iteration (for the computation of the Gauss-Newton step). Additionally, the algorithm can be more robust than using the Gauss-Newton method with a line search.
There is also extensive discussion about trust region methods in general being able to find solutions (e.g, convergence to a minimum) even when the initial starting point isn't near the final solution.
SteveDenham