Hello
I am modeling a competition assay in which two competitors are inoculated in a "+" formation, and I evaluate whether the vertical genotype wins as a binary response variable. I have two fixed main predictors each for the vertical and horizontal genotype (Envir and Protein, each with 2 levels), so for each competition ("trial") there is a VertEnv, HorizEnv, VertProtein, and HorizProtein. I have genotypes, which come from different environments, as random factors, and replication at the level of the Vert_genotype*Horiz_genotype combination. There is almost complete consistency among the three replicates for each such combination. This is an all-by-all assay, so the Vert_genotypes and Horiz_genotypes appear multiple times in different combinations. There is imbalance in both fixed predictors, such that for a few levels of the 4-way fixed-effect interaction there is only one combination of genotypes yielding a particular outcome, but every combination of the fixed effects has at least a few trials with both outcomes. I am interested in a number of 2-way interactions among the fixed factors, but fear there may be important 3-way interactions, or that the 4-way is important.
At this point my model is:
proc glimmix data=xxx method=laplace;
class Vert_genotype Horiz_genotype Venv Henv VProt HProt;
model Win_V (event='1') = Venv|Henv|VProt|HProt
/ ddfm=bw link=logit dist=binary;
random Vert_genotype(Venv) Horiz_genotype(Henv) Vert_genotype*Horiz_genotype(Venv*Henv);
lsmeans Venv*Henv*VProt*HProt / pdiff OR adjust=tukey ilink CL;
nloptions tech=nrridg maxiter=500;
run;
*(Laplace based on Bolker et al 2008, Trends in Ecol and Evol, among others);
*(bw based on various message boards, where posts suggest bw ddfm for Laplace with binary);
My questions are:
Thanks for your insights!
Be very careful with Laplace or quadrature with binary or binomial data. If any of the random-effect variance estimates are 0, then everything blows up (without any warnings). Many of the standard errors of the LSMEANS are 0 or go to zero, which means you get meaningless results. The Type III tests can be extremely inflated. You must remove any random effects with a 0 variance to get sensible results. This is not an issue with RSPL estimation.
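One way to check for this is to capture the covariance parameter table from the fit and look for estimates at or near zero. The following is only a sketch under assumptions: the data set name xxx and the variable names are taken from the original post, cp and zero_var are hypothetical data set names, and the 1e-8 cutoff is an arbitrary illustrative threshold, not a recommended value.

```sas
/* Save the covariance parameter estimates from the Laplace fit.
   cp is a hypothetical output data set name. */
ods output CovParms=cp;
proc glimmix data=xxx method=laplace;
  class Vert_genotype Horiz_genotype Venv Henv VProt HProt;
  model Win_V(event='1') = Venv|Henv|VProt|HProt
        / ddfm=bw link=logit dist=binary;
  random Vert_genotype(Venv) Horiz_genotype(Henv)
         Vert_genotype*Horiz_genotype(Venv*Henv);
  nloptions tech=nrridg maxiter=500;
run;

/* Keep only variance components estimated at (or numerically near) zero,
   or with a missing standard error. The 1e-8 cutoff is an assumption. */
data zero_var;
  set cp;
  if Estimate <= 1e-8 or missing(StdErr);
run;

/* List the flagged components; these are candidates to drop
   from the RANDOM statement before refitting. */
proc print data=zero_var noobs;
  var CovParm Estimate StdErr;
run;
```

If zero_var is not empty, the model can be refit with the listed effects removed from the RANDOM statement before the LSMEANS and Type III tests are interpreted.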
Early work suggested that RSPL could give biased results with a binary response, so many have suggested Laplace or quadrature. However, recent work by Stroup shows that this is not always the case. My view, based on the most recent evidence, is that there is no clear-cut best estimation method. But I do lean towards Laplace for purely binary responses. Just make sure you don't have any zero variances.
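Since neither method is clearly best, it may be worth fitting the same model with RSPL and comparing the two sets of estimates. This is a rough sketch, assuming the same data set and variable names as the original post; cp_rspl is a hypothetical output data set name, and keeping ddfm=bw is simply carried over from the original model rather than a recommendation.

```sas
/* Refit the same model with RSPL (the GLIMMIX default estimation method)
   so the covariance parameters and LSMEANS can be compared with the
   Laplace fit. cp_rspl is a hypothetical output data set name. */
ods output CovParms=cp_rspl;
proc glimmix data=xxx method=rspl;
  class Vert_genotype Horiz_genotype Venv Henv VProt HProt;
  model Win_V(event='1') = Venv|Henv|VProt|HProt
        / ddfm=bw link=logit dist=binary;
  random Vert_genotype(Venv) Horiz_genotype(Henv)
         Vert_genotype*Horiz_genotype(Venv*Henv);
  lsmeans Venv*Henv*VProt*HProt / ilink cl;
run;
```

Large disagreements between the two fits, especially in the standard errors, would be a reason to look again at the variance components.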
Thanks so much for the warning! I actually have a sort of "sister model" with a simpler layout that was showing similar symptoms. It did indeed have random effects (factorial design) with zero variances, and removing them made the results look much more reasonable. But that wasn't the problem with the model I posted about. Your suggestion got me thinking again about my random statements, though. For a particular vertical genotype-horizontal genotype combination, the three replicates were very frequently identical in outcome, so I made consensus calls for each genotype combination and removed that interaction from the model. The confidence intervals now look more reasonable too. Does this make sense?
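For anyone following along, the consensus-call step described above could look roughly like this. It is a sketch under assumptions: Win_V is taken to be a numeric 0/1 variable, the consensus is taken to be the majority outcome of the three replicates, and consensus and Win_V_cons are hypothetical names that do not appear in the original model.

```sas
/* Collapse the replicates of each genotype pair into one majority call.
   Assumes Win_V is numeric 0/1; with three replicates there are no ties. */
proc sql;
  create table consensus as
  select Vert_genotype, Horiz_genotype, Venv, Henv, VProt, HProt,
         (mean(Win_V) > 0.5) as Win_V_cons
  from xxx
  group by Vert_genotype, Horiz_genotype, Venv, Henv, VProt, HProt;
quit;

/* Refit on one row per genotype pair, without the
   Vert_genotype*Horiz_genotype random effect. */
proc glimmix data=consensus method=laplace;
  class Vert_genotype Horiz_genotype Venv Henv VProt HProt;
  model Win_V_cons(event='1') = Venv|Henv|VProt|HProt
        / ddfm=bw link=logit dist=binary;
  random Vert_genotype(Venv) Horiz_genotype(Henv);
  lsmeans Venv*Henv*VProt*HProt / pdiff OR adjust=tukey ilink CL;
  nloptions tech=nrridg maxiter=500;
run;
```

With one row per genotype pair there is no replication left to estimate the pair-level variance, which is why that term is dropped from the RANDOM statement here.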