Statistical Procedures

Programming the statistical procedures from SAS
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Education
Obsidian | Level 7

Hi

 

I recorded survival of my study animals up to 5 weeks allocated to three different treatments over a two year period in a binomial format. Now i am trying to evaluate the effect of year, genotype, sex and treatment on survival but when i run the model below, the GLIMMIX procedure does not converge, any help please:

 

 

Proc glimmix data= surv;
class year camp gene sex treatment;
model survival = year|genotype|sex|treatment / solution ddfm=res dist=binomial link=probit;
random camp;
run; quit;

 

 

Log output:

NOTE: Some observations are not used in the analysis because of: missing fixed effects (n=1).
NOTE: Did not converge.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE GLIMMIX used (Total process time):
real time 3.59 seconds
cpu time 3.16 seconds

 

Any who can help me please, what other fixed effects are missing?

1 ACCEPTED SOLUTION

Accepted Solutions
sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

My responses assume that I understand your study design correctly. I think I do, but there also could be some important detail that you have not yet provided. It does seem that year (and gene) are whole plot factors in this study, and that treatment (and sex) are subplot factors.

 

The second error term can be specified as the interaction of the whole plot unit and the subplot factor(s). 

 

For easy reference, here is the model I suggested previously:

 

proc glimmix data=survival method=laplace;
    class year pen gene sex treatment;
    model survival6w (descending) = year gene treatment sex / dist=binary;
    random intercept treatment*sex / subject=pen(year gene);
    lsmeans year gene treatment sex / ilink;
    run;

For your study, the whole plot error is variance among pens, specified by the combination of intercept and subject=pen(year gene) in the RANDOM statement. The subplot error is variance among groups of animals within pens treated alike, specified by the combination of treatment*sex and subject=pen(year gene). An equivalent syntax for the RANDOM statement that you might find more intuitive is

 

random pen(year gene) treatment*sex*pen(year gene);

 

treatment*sex*pen(year gene) is an interaction of the subplot factors treatment and sex and the whole plot unit pen(year gene).

 

I'm not particularly enthused about the results from the model above, but the data are so sparsely allocated across the design and most animals survived, so there is just not much to work with. 

 

View solution in original post

12 REPLIES 12
Doc_Duke
Rhodochrosite | Level 12

A common reason for lack of convergens is too small a sample size.  You have 24 degrees of greedome in fixed effects and an unknown addition in random effects.  Can you share your code and data?

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

The original poster provided a bit more design information here

 

https://communities.sas.com/t5/SAS-Statistical-Procedures/Split-plot-with-covariate/m-p/359891#M1899...

 

but really not enough to address the questions. @Education, more details please. I doubt that one (n=1) missing observation causes lack of convergence; I might guess that it's a mismatch between the data structure and the model--for example, too much model and too few observations as suggested by @Doc_Duke. Lacking more information about the study, that's just a guess.

Education
Obsidian | Level 7
 
Education
Obsidian | Level 7

@Doc_Duke, i have attached the data on the previous post above, the code i am trying to run is as follows:

 

Proc glimmix data= survival;
class year pen gene sex treatment;
model survival6w= year gene sex treatment/ solution ddfm=res dist=binomial link=probit;
random pen;
run; quit;

 

 

Thank you for your help.

 

Education
Obsidian | Level 7

From the data sheet, the response variable (survival6w) in this case (1: animals that survived up to 6 weeks; 0: did not survive), the pen is the kraal from which the animals studied were born (thus, some animals sharing a similar pen in a year were born from the same parents). However, the parents were switched from one pen to another in different years because it has been observed that pen also influence production parameters of parents.

 

I hope it helps explaining the data, thank you.

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

In your dataset, there are 29 observations for survival6w=0 and 386 for survival6w=1. The crosstabulation of survival6w by year, gene, sex, and treatment shows that there are zero observations for most combinations of your explanatory variables. See the attached file. This sparseness limits what you can do with the model (for example, whether you can include interactions) and whether estimation will be successful (i.e., whether the model will converge). Intuitively, if 93% of animals survive, we can imagine that it will be hard to assess whether explanatory factors can statistically distinguish between the two outcomes (alive or dead); after all, if we predict survival in all cases, we will be right 93% of the time, it's hard to improve on that naive estimate. Lack of variability in the response results in lack of predictive or explanatory capability.

 

Pen numbers are re-used in the two years, so you have to specify pen(year) to identify each pen uniquely. Each pen is unique for one level of gene. There are sometimes multiple combinations of treatment and sex within each pen, but all pens are incomplete for all combinations (i.e., each pen is missing one or more treatment x sex combinations). There are multiple animals within each treatment x sex combination in a given pen which probably should be thought of as subsamples if you are thinking of pens as replicates for year and gene, and as blocks for treatment and sex. I would consider starting (but not necessarily ending) with this model:

 

proc glimmix data=survival method=laplace;
    class year pen gene sex treatment;
    model survival6w (descending) = year gene treatment sex / dist=binary;
    random intercept treatment*sex / subject=pen(year gene);
    lsmeans year gene treatment sex / ilink;
    run;

 

Education
Obsidian | Level 7

Thank you @sld, the pens were not acting as replicates for year or gene but they were the kraals of origin, where the study animals originated. The allocation of parent animals to pens was that a pair should not stay in the same pen for two consecutive seasons. Then the offspring were allocated to different groups randomly, so to control for relatedness between offspring in different treatments i was using pen as a random variable.

 

Also i see you aded "ilink" on the lsmeans statement, what does it do in this case?

 

 

Thank you.

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

Is it true that each pen is essentially the mating of one male and one female (the pair represents a particular level of GENE) in a given YEAR, resulting in multiple offspring which are of different SEXes and assigned to different TREATMENTs? What exactly do the levels of GENE represent?

 

In your study, what do you think serves as an independent replication? You may find it useful to ponder this question in the light of a split plot design, if my understanding your study design is correct.

 

The documentation for the LSMEANS statement in the GLIMMIX procedure will describe the function of the ILINK option.

 

Education
Obsidian | Level 7

Yes it is true. Gene represents the genotype of the offspring which in some cases the parents were of different genotype, so the offspring of crosses were grouped together as same genotype regardless of the reciprocal crossing.

 

Your understanding is very correct, the split plot idea i had it before and i used it for some other analysis, but in this case i was restricted by the fact that the response is binary and the convergence problem i was getting.

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

Ah, yes, we do sometimes run into convergence problems with generalized linear mixed models--for example, a split plot design with a binary response such as yours.

 

The code I posted previously implements a split-plot design with subsamples. Pen is the whole plot unit; year and gene are whole plot factors. A group of animals within a pen assigned to the same treatment x sex combination is the subplot unit; treatment and sex are subplot factors. Multiple animals within the same group from the same pen assigned to the same treatment x sex combination are subsamples. The code uses the data set you posted, and it converges even though the design is very unbalanced, with reasonable-looking results (but no evidence that any of the experimental factors is associated with survival).

Education
Obsidian | Level 7

The split plot i was running i used year as a whole plot factor and treatment within year as my subplot factor but the model did not converge, i think i was putting too much information in the model and also pen number was my random variable. From my understanding of a split plot there should be two error terms, but from your code above i cannot see the interaction between your whole plot factor and subplot factor as the second error term.

 

 

 

 

sld
Rhodochrosite | Level 12 sld
Rhodochrosite | Level 12

My responses assume that I understand your study design correctly. I think I do, but there also could be some important detail that you have not yet provided. It does seem that year (and gene) are whole plot factors in this study, and that treatment (and sex) are subplot factors.

 

The second error term can be specified as the interaction of the whole plot unit and the subplot factor(s). 

 

For easy reference, here is the model I suggested previously:

 

proc glimmix data=survival method=laplace;
    class year pen gene sex treatment;
    model survival6w (descending) = year gene treatment sex / dist=binary;
    random intercept treatment*sex / subject=pen(year gene);
    lsmeans year gene treatment sex / ilink;
    run;

For your study, the whole plot error is variance among pens, specified by the combination of intercept and subject=pen(year gene) in the RANDOM statement. The subplot error is variance among groups of animals within pens treated alike, specified by the combination of treatment*sex and subject=pen(year gene). An equivalent syntax for the RANDOM statement that you might find more intuitive is

 

random pen(year gene) treatment*sex*pen(year gene);

 

treatment*sex*pen(year gene) is an interaction of the subplot factors treatment and sex and the whole plot unit pen(year gene).

 

I'm not particularly enthused about the results from the model above, but the data are so sparsely allocated across the design and most animals survived, so there is just not much to work with. 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 14508 views
  • 0 likes
  • 3 in conversation