Solved: Nested Variable- Did Not Converge Error Proc Glimmix

DanaM · Posted 12-09-2020 02:06 PM

I'm using proc glimmix to analyze calves born and calves weaned but it won't run when I have a nested variable in the model statement for calves born. The data is almost exactly the same for calves born and calves weaned, both are 0s and 1s with only a slight difference in the total amount of each between the variables, but for some reason proc glimmix will only converge my data for calves weaned with a nested variable. The models are exactly the same except for the dependent variable. They are:

birthed_calfnum= f2_animal_breed cow_age(f2_animal_birth_year) / dist=binomial link=logit;

random f2_animal_id;

and

weaned_calfnum= f2_animal_breed cow_age(f2_animal_birth_year) / dist=binomial link=logit;

random f2_animal_id;

Does anyone have any idea what's going on? Thank you!

SteveDenham · Posted 12-10-2020 10:05 AM

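GLIMMIX issues the "Did not converge" message when it hits an iteration limit. By default it allows 20 outer pseudo-likelihood optimizations (the MAXOPT= option), and each optimization has its own iteration cap, which NLOPTIONS controls. Do you have an NLOPTIONS statement? If not, add this line: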

nloptions maxiter=500;

See if that solves the issue.

SteveDenham

DanaM · Posted 12-10-2020 12:33 PM

I added that line of code but it still didn't converge. I even tried 1000 instead of 500 just to see what would happen but it didn't work.

SteveDenham · Posted 12-11-2020 08:18 AM

Dang. I was hoping it was something simple. Well, first off, SAS parameterizes nested effects exactly like crossed effects in the X matrix, so the nesting itself is probably not the issue. It might be a separation issue due to confounding. What does the two-way crosstab from PROC FREQ tell you (with a TABLES statement like cow_age*f2_animal_birth_year)? It should not have a lot of zero cells, such as age=3, birthyear=2017 versus age=3, birthyear=2018. Just typing that out makes me think that age and birth year aren't defined the way I thought, as I would expect the two terms to be completely confounded. Does f2_animal_birth_year refer to the calf's birth year or the dam's birth year? If it is the dam's birth year, you probably do have confounding for calves born, but losses prior to weaning may break that confounding for calves weaned. So, what to do? Could you treat age as a continuous variable, so that the age by birth year term fits a separate slope for each birth year?
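A minimal sketch of the crosstab check and of the continuous-age alternative described above. The data set name calves is hypothetical (the thread never names it), and the variable names are taken from the original post. Treat this as an illustration under those assumptions, not as the poster's actual code.

```sas
/* Assumption: the analysis data set is called CALVES (hypothetical name). */

/* Step 1: look for empty or nearly empty age-by-birth-year cells. */
proc freq data=calves;
   tables cow_age*f2_animal_birth_year / norow nocol nopercent;
   /* LIST SPARSE also prints combinations that never occur in the data, */
   /* so cells with no 0s or no 1s for the response are easy to spot.    */
   tables f2_animal_birth_year*cow_age*birthed_calfnum
          / list sparse nocum nopercent;
run;

/* Step 2: one possible form of the "separate slope per birth year" model. */
/* cow_age is left out of the CLASS statement, so it enters as continuous. */
proc glimmix data=calves;
   class f2_animal_id f2_animal_breed f2_animal_birth_year;
   model birthed_calfnum = f2_animal_breed
                           f2_animal_birth_year
                           cow_age*f2_animal_birth_year
         / dist=binomial link=logit solution;
   random intercept / subject=f2_animal_id;
run;
```

In the second step, the f2_animal_birth_year main effect gives each birth year its own intercept and the cow_age*f2_animal_birth_year term gives each birth year its own slope in age. Whether a linear age effect on the logit scale is reasonable for these data is a separate question.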

Have you considered alternating logistic regression (ALR) using PROC GEE? If you are fitting a marginal model in GLIMMIX, then ALR in PROC GEE would be a possible way to get at this. If possible, please share your GLIMMIX code.

SteveDenham

DanaM · Posted 12-11-2020 02:12 PM

I've attached a PDF of my frequency table for f2 animal birth year and cow age. F2 animal birth year is the year the dam was born. I've also tried running the model with the year the calf was born instead, and it still won't converge. I'm not sure about making birth year continuous; I will have to talk to my advisor about that. I'll also ask him about using PROC GEE instead of PROC GLIMMIX. My code is:

proc glimmix MAXLMMUPDATE=1000;
class f2_animal_id f2_animal_breed cow_age f2_animal_birth_year;
model birthed_calfnum= f2_animal_breed cow_age(f2_animal_birth_year)/dist=binomial link=logit;
random f2_animal_id;
lsmeans f2_animal_breed cow_age(f2_animal_birth_year)/ cl ilink;
run;

I've also attached a picture of the results I get when I run it. I included results for PROC GLM with almost the exact same model just to give you an idea of what my data looks like. Thank you so much for your help!

SteveDenham · Posted 12-14-2020 10:50 AM

First, try these two things:

In the NLOPTIONS statement, change the optimization algorithm by adding tech=nrridg.

Change the RANDOM statement to

random intercept/subject=f2_animal_id;
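Put together, the two changes above might look like the sketch below. The data set name calves is hypothetical, the rest of the code follows the poster's original step, and maxiter=500 is carried over from the earlier reply rather than being a recommended value.

```sas
/* Assumption: the analysis data set is called CALVES (hypothetical name). */
proc glimmix data=calves;
   class f2_animal_id f2_animal_breed cow_age f2_animal_birth_year;
   model birthed_calfnum = f2_animal_breed cow_age(f2_animal_birth_year)
         / dist=binomial link=logit;
   /* Same random cow effect, written as a subject-specific intercept */
   random intercept / subject=f2_animal_id;
   /* NRRIDG = Newton-Raphson with ridging; MAXITER raises the iteration cap */
   nloptions tech=nrridg maxiter=500;
   lsmeans f2_animal_breed cow_age(f2_animal_birth_year) / cl ilink;
run;
```

The SUBJECT= form describes the same random effect as random f2_animal_id, but it lets GLIMMIX process the data cow by cow, which is usually more efficient.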

If that doesn't do it, then some other changes might work. Looking at the iteration history, I see that things are well behaved, then there is a blow-up, then another well-behaved stretch in which the objective function essentially returns to the same region, then another blow-up, and so on. That behavior is typical of quasi-separation when binomial data are fit with the REML-type pseudo-likelihood that GLIMMIX uses by default. So perhaps an integral approximation would work (method=laplace or method=quad in the PROC GLIMMIX statement). Or perhaps what Milliken and Johnson refer to as a means model (one-way model) would work. It is harder to get main effects out (you would need LSMESTIMATE statements with the right options to get the joint F tests), but it ought to converge. In this case your MODEL statement would look like:

model birthed_calfnum= f2_animal_breed*cow_age*f2_animal_birth_year/dist=binomial link=logit;
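For the integral-approximation suggestion (method=laplace or method=quad), a sketch of the full step could look like this. The data set name calves is hypothetical, and the nested model from the original post is kept rather than the means model.

```sas
/* Assumption: the analysis data set is called CALVES (hypothetical name). */
/* METHOD=LAPLACE replaces the default pseudo-likelihood with a Laplace    */
/* approximation to the marginal likelihood; swap in METHOD=QUAD for       */
/* adaptive Gauss-Hermite quadrature.                                      */
proc glimmix data=calves method=laplace;
   class f2_animal_id f2_animal_breed cow_age f2_animal_birth_year;
   model birthed_calfnum = f2_animal_breed cow_age(f2_animal_birth_year)
         / dist=binomial link=logit;
   /* METHOD=QUAD requires the SUBJECT= form of the RANDOM statement */
   random intercept / subject=f2_animal_id;
   lsmeans f2_animal_breed cow_age(f2_animal_birth_year) / cl ilink;
run;
```

Both methods give maximum likelihood rather than REML-type estimates, so the variance component estimate can differ somewhat from what the default method would have reported.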

And then there is the generalized estimating equation (GEE) approach using alternating logistic regression. (See the PROC GENMOD documentation, as you will want LSMEANS.) You would have a REPEATED statement that looks something like:

repeated subject=f2_animal_id / logor=fullclust;

I don't know about this approach though.
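For reference, the REPEATED statement above would sit in a PROC GENMOD step roughly like the sketch below. The data set name calves is hypothetical, and event='1' assumes that a value of 1 means a calf was born; neither detail is confirmed in the thread.

```sas
/* Assumption: the analysis data set is called CALVES (hypothetical name). */
proc genmod data=calves;
   class f2_animal_id f2_animal_breed cow_age f2_animal_birth_year;
   /* event='1' is an assumption about how the response is coded */
   model birthed_calfnum(event='1') = f2_animal_breed
                                      cow_age(f2_animal_birth_year)
         / dist=binomial link=logit;
   /* Alternating logistic regression: within-cow association is modeled */
   /* through log odds ratios instead of a working correlation matrix.   */
   repeated subject=f2_animal_id / logor=fullclust;
   lsmeans f2_animal_breed / cl ilink;
run;
```

LOGOR=FULLCLUST estimates a separate log odds ratio for every pair of records within a cow, which can mean many parameters; LOGOR=EXCH (a single common log odds ratio) is a simpler structure to try first. This is a marginal model, so its estimates are population-averaged and are not directly comparable to the cow-specific estimates from GLIMMIX with a random intercept.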

SteveDenham

DanaM · Posted 12-14-2020 12:27 PM

Thank you so much! I was able to get it to run with method=quad and method=laplace.
