- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi
I recorded survival of my study animals up to 5 weeks allocated to three different treatments over a two year period in a binomial format. Now i am trying to evaluate the effect of year, genotype, sex and treatment on survival but when i run the model below, the GLIMMIX procedure does not converge, any help please:
Proc glimmix data= surv;
class year camp gene sex treatment;
model survival = year|genotype|sex|treatment / solution ddfm=res dist=binomial link=probit;
random camp;
run; quit;
Log output:
NOTE: Some observations are not used in the analysis because of: missing fixed effects (n=1).
NOTE: Did not converge.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 3.59 seconds
cpu time 3.16 seconds
Any who can help me please, what other fixed effects are missing?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
My responses assume that I understand your study design correctly. I think I do, but there also could be some important detail that you have not yet provided. It does seem that year (and gene) are whole plot factors in this study, and that treatment (and sex) are subplot factors.
The second error term can be specified as the interaction of the whole plot unit and the subplot factor(s).
For easy reference, here is the model I suggested previously:
proc glimmix data=survival method=laplace;
class year pen gene sex treatment;
model survival6w (descending) = year gene treatment sex / dist=binary;
random intercept treatment*sex / subject=pen(year gene);
lsmeans year gene treatment sex / ilink;
run;
For your study, the whole plot error is variance among pens, specified by the combination of intercept and subject=pen(year gene) in the RANDOM statement. The subplot error is variance among groups of animals within pens treated alike, specified by the combination of treatment*sex and subject=pen(year gene). An equivalent syntax for the RANDOM statement that you might find more intuitive is
random pen(year gene) treatment*sex*pen(year gene);
treatment*sex*pen(year gene) is an interaction of the subplot factors treatment and sex and the whole plot unit pen(year gene).
I'm not particularly enthused about the results from the model above, but the data are so sparsely allocated across the design and most animals survived, so there is just not much to work with.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A common reason for lack of convergens is too small a sample size. You have 24 degrees of greedome in fixed effects and an unknown addition in random effects. Can you share your code and data?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The original poster provided a bit more design information here
but really not enough to address the questions. @Education, more details please. I doubt that one (n=1) missing observation causes lack of convergence; I might guess that it's a mismatch between the data structure and the model--for example, too much model and too few observations as suggested by @Doc_Duke. Lacking more information about the study, that's just a guess.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Doc_Duke, i have attached the data on the previous post above, the code i am trying to run is as follows:
Proc glimmix data= survival;
class year pen gene sex treatment;
model survival6w= year gene sex treatment/ solution ddfm=res dist=binomial link=probit;
random pen;
run; quit;
Thank you for your help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
From the data sheet, the response variable (survival6w) in this case (1: animals that survived up to 6 weeks; 0: did not survive), the pen is the kraal from which the animals studied were born (thus, some animals sharing a similar pen in a year were born from the same parents). However, the parents were switched from one pen to another in different years because it has been observed that pen also influence production parameters of parents.
I hope it helps explaining the data, thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In your dataset, there are 29 observations for survival6w=0 and 386 for survival6w=1. The crosstabulation of survival6w by year, gene, sex, and treatment shows that there are zero observations for most combinations of your explanatory variables. See the attached file. This sparseness limits what you can do with the model (for example, whether you can include interactions) and whether estimation will be successful (i.e., whether the model will converge). Intuitively, if 93% of animals survive, we can imagine that it will be hard to assess whether explanatory factors can statistically distinguish between the two outcomes (alive or dead); after all, if we predict survival in all cases, we will be right 93% of the time, it's hard to improve on that naive estimate. Lack of variability in the response results in lack of predictive or explanatory capability.
Pen numbers are re-used in the two years, so you have to specify pen(year) to identify each pen uniquely. Each pen is unique for one level of gene. There are sometimes multiple combinations of treatment and sex within each pen, but all pens are incomplete for all combinations (i.e., each pen is missing one or more treatment x sex combinations). There are multiple animals within each treatment x sex combination in a given pen which probably should be thought of as subsamples if you are thinking of pens as replicates for year and gene, and as blocks for treatment and sex. I would consider starting (but not necessarily ending) with this model:
proc glimmix data=survival method=laplace;
class year pen gene sex treatment;
model survival6w (descending) = year gene treatment sex / dist=binary;
random intercept treatment*sex / subject=pen(year gene);
lsmeans year gene treatment sex / ilink;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you @sld, the pens were not acting as replicates for year or gene but they were the kraals of origin, where the study animals originated. The allocation of parent animals to pens was that a pair should not stay in the same pen for two consecutive seasons. Then the offspring were allocated to different groups randomly, so to control for relatedness between offspring in different treatments i was using pen as a random variable.
Also i see you aded "ilink" on the lsmeans statement, what does it do in this case?
Thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is it true that each pen is essentially the mating of one male and one female (the pair represents a particular level of GENE) in a given YEAR, resulting in multiple offspring which are of different SEXes and assigned to different TREATMENTs? What exactly do the levels of GENE represent?
In your study, what do you think serves as an independent replication? You may find it useful to ponder this question in the light of a split plot design, if my understanding your study design is correct.
The documentation for the LSMEANS statement in the GLIMMIX procedure will describe the function of the ILINK option.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes it is true. Gene represents the genotype of the offspring which in some cases the parents were of different genotype, so the offspring of crosses were grouped together as same genotype regardless of the reciprocal crossing.
Your understanding is very correct, the split plot idea i had it before and i used it for some other analysis, but in this case i was restricted by the fact that the response is binary and the convergence problem i was getting.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ah, yes, we do sometimes run into convergence problems with generalized linear mixed models--for example, a split plot design with a binary response such as yours.
The code I posted previously implements a split-plot design with subsamples. Pen is the whole plot unit; year and gene are whole plot factors. A group of animals within a pen assigned to the same treatment x sex combination is the subplot unit; treatment and sex are subplot factors. Multiple animals within the same group from the same pen assigned to the same treatment x sex combination are subsamples. The code uses the data set you posted, and it converges even though the design is very unbalanced, with reasonable-looking results (but no evidence that any of the experimental factors is associated with survival).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The split plot i was running i used year as a whole plot factor and treatment within year as my subplot factor but the model did not converge, i think i was putting too much information in the model and also pen number was my random variable. From my understanding of a split plot there should be two error terms, but from your code above i cannot see the interaction between your whole plot factor and subplot factor as the second error term.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
My responses assume that I understand your study design correctly. I think I do, but there also could be some important detail that you have not yet provided. It does seem that year (and gene) are whole plot factors in this study, and that treatment (and sex) are subplot factors.
The second error term can be specified as the interaction of the whole plot unit and the subplot factor(s).
For easy reference, here is the model I suggested previously:
proc glimmix data=survival method=laplace;
class year pen gene sex treatment;
model survival6w (descending) = year gene treatment sex / dist=binary;
random intercept treatment*sex / subject=pen(year gene);
lsmeans year gene treatment sex / ilink;
run;
For your study, the whole plot error is variance among pens, specified by the combination of intercept and subject=pen(year gene) in the RANDOM statement. The subplot error is variance among groups of animals within pens treated alike, specified by the combination of treatment*sex and subject=pen(year gene). An equivalent syntax for the RANDOM statement that you might find more intuitive is
random pen(year gene) treatment*sex*pen(year gene);
treatment*sex*pen(year gene) is an interaction of the subplot factors treatment and sex and the whole plot unit pen(year gene).
I'm not particularly enthused about the results from the model above, but the data are so sparsely allocated across the design and most animals survived, so there is just not much to work with.