Greetings. I am hoping to gain some understanding of a problem I'm running into in analyzing data from digital microscopy. I'll be brief to start because the information needed may be simple, but I am willing to provide more details if someone is interested. I suspect the basic problem here is rather routine, however, and just new to me in this new endeavor.
Goal: compare staining of tissue between species and infection status.
Method: digital acquisition and analysis by color separation, resulting in a measure (in pixels, so count data) of total tissue present (the denominator) and total stain present (not intensity, only whether stain is present in that pixel). The response variable is the proportion of tissue stained. HOWEVER, the amount of 'background' staining differs between species, so the approach being considered is to subtract a species-specific background from each proportion. For most stains, staining is a substantial proportion relative to background. For some stains, however, the proportion stained is relatively low and sometimes absent (better thought of as not detected). This sometimes results in background-corrected 'proportions' that are zero or slightly negative.
Analysis: At first glance I would think to apply beta regression, and this seems to work when all proportions for a stain are greater than zero. But observations drop out when they are zero or negative. I don't think this is a situation for a zero-inflated or hurdle model as I've understood them. I am tempted to re-scale the data so that all values fit in the open interval (0,1). Others in my group want to just change the negative values to zero and analyze the data as if normally distributed. Perhaps the data just can't be analyzed in a way that makes sense, and it's all about discovering why there are images with less staining than the species-specific background.
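General code that works for the stains where all background-corrected proportions are >0:
proc glimmix;
   /* species and infection are categorical, so they need to be CLASS variables */
   class species infection;
   /* species|infection expands to both main effects plus their interaction */
   model proportion = species|infection / distribution=beta solution;
run;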
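For concreteness, the re-scaling I'm tempted by could look something like the sketch below. It is only an illustration: the dataset name `stains` and the variables `corrected`, `p01`, and `proportion_adj` are made up, and the bounds of -1 and 1 for a background-corrected proportion are my assumption. The second step is the commonly used "squeeze" of [0,1] into (0,1) based on the number of observations.

```sas
/* Sketch only: dataset and variable names are hypothetical. */

/* Count the non-missing background-corrected values */
proc sql noprint;
   select count(corrected) into :n_obs trimmed from stains;
quit;

data stains_rescaled;
   set stains;
   /* Step 1: map corrected values from the assumed range [-1, 1] onto [0, 1] */
   p01 = (corrected + 1) / 2;
   /* Step 2: compress [0, 1] into the open interval (0, 1) */
   proportion_adj = (p01 * (&n_obs - 1) + 0.5) / &n_obs;
run;
```

The beta model above could then be fit to `proportion_adj`, although the estimates would be on the re-scaled scale and would depend on the bounds chosen in step 1, which is part of why I am unsure this is sensible.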
I would appreciate any guidance to help me get started.
Dave
Another approach would be to leave the backgrounds in, and to compare the differences between species means to the differences in their backgrounds.
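A rough sketch of what that could look like, assuming the background measurements are available as their own image-level rows. Everything here is hypothetical: the dataset `pixels`, the count variables `stained` and `tissue`, and a `source` variable taking the values 'background' or 'stain'. Infection is left out to keep the sketch short; it could be added to the model if background images exist for each infection status.

```sas
/* Sketch only: dataset and variable names are hypothetical.
   One row per image; source = 'background' or 'stain'. */
proc glimmix data=pixels;
   class species source;
   /* events/trials syntax: stained pixels out of total tissue pixels */
   model stained/tissue = species|source / dist=binomial link=logit solution;
   /* multiplicative overdispersion, since pixels within an image are not independent */
   random _residual_;
   /* cell means on the proportion scale, with pairwise differences */
   lsmeans species*source / ilink diff;
run;
```

In this layout the species*source interaction is what compares the species difference in staining to the species difference in background, and images with zero stained pixels stay in the analysis because the counts are modeled directly.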