Hi all,
I'm using ZINB models to examine count data in 44 youth who are part of a summer research treatment program. The dataset is highly unbalanced: some youth have many time-outs, and others have only a few. Given this, I'm including an offset variable to account for the number of time-outs a child got within a day. Given the overdispersion and excess zeros, ZINB appears most appropriate, and I'm using a log link (LINK = LOG). Does my offset variable also need to be logged? I'm seeing mixed responses online. My code is below.
Thanks much!
PROC GENMOD DATA = TIMEOUT;
CLASS Meds;
MODEL Behavior = time X1 X2 X3 X4 / link = log dist = zinb offset = timeout_withindaycount; /* does this need to be log transformed? */
ZEROMODEL time X1 X2 X3 X4;
RUN;
The purpose of using an offset in a model on a count response is so that you can model a rate - the ratio of the mean count to some exposure or population amount. See this note that discusses it. As shown there, in a log-linked model the denominator variable of the desired rate should also be log transformed. Assuming that your BEHAVIOR variable is a count and if your intent is to model the rate defined by BEHAVIOR/TIMEOUT_WITHINDAYCOUNT, then you need to log-transform the timeout variable. See this note that discusses and illustrates modeling mean counts and rates with zero-inflated models.
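To make the reasoning concrete: with a log link, modeling the rate means log(mean/exposure) = xb, which rearranges to log(mean) = xb + log(exposure). The offset is the term added to the linear predictor with its coefficient fixed at 1, so it has to be log(exposure) rather than the raw exposure. A minimal sketch of how that might look for the code in the question follows. The names TIMEOUT2 and log_timeout are hypothetical, and the CLASS statement is left out only because Meds does not appear in either model.

```sas
/* Sketch only: TIMEOUT2 and log_timeout are hypothetical names, not from the original post. */
DATA TIMEOUT2;
  SET TIMEOUT;
  /* LOG is undefined at zero, so log_timeout stays missing on days with no time-outs.
     Those rows are expected to be excluded from the fit. */
  IF timeout_withindaycount > 0 THEN log_timeout = LOG(timeout_withindaycount);
RUN;

PROC GENMOD DATA = TIMEOUT2;
  /* OFFSET= is an option of the MODEL statement, so it goes before the semicolon. */
  MODEL Behavior = time X1 X2 X3 X4 / DIST = ZINB LINK = LOG OFFSET = log_timeout;
  ZEROMODEL time X1 X2 X3 X4;
RUN;
```

If many days have zero time-outs, it is worth checking how many observations are dropped, since a rate with a zero denominator is not defined for those days.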