krejcia
Calcite | Level 5

Hello everybody and SAS-friends,

I would like to ask how the choice of distribution affects the model in proc genmod. I'm analyzing rate data (rate = counts/exposure). Because of overdispersion, I didn't use the Poisson distribution but the negative binomial. I supposed that this choice of distribution changes my regression formula. If I used the Poisson distribution, the model would be log(rate) = ..., so I would use counts as the response and add offset = log(exposure), just like in Getting Started: Poisson Distribution. Do I have to change the offset when I use the negative binomial distribution? In articles I have only found a regression formula for rates with the binomial distribution: log(rate/(1 - rate)) = ..., so after some algebra the offset for the counts variable would be log(exposure - counts). But what about the offset for the negative binomial distribution? Is it the same as for the binomial distribution?

Do you know where I could find the answer?

Thank you very much for any reply.

Annanomi.

2 REPLIES
Ksharp
Super User

My guess:

The negative binomial distribution takes the values x = 0, 1, 2, ..., just like the Poisson distribution, so you can use offset= in the same way as in Poisson regression.

Xia Keshan
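
To spell out the reasoning in the reply above: with a log link, the offset enters the model for the mean count in the same way under both distributions, and only the variance function differs. A short sketch, where the symbols are introduced here for illustration (mu_i is the expected count, t_i the exposure, x_i the covariates, and k the negative binomial dispersion parameter in the parameterization used by proc genmod):

$$
\log \mu_i = \log t_i + x_i^{\top}\beta
\quad\Longleftrightarrow\quad
\log\!\left(\frac{\mu_i}{t_i}\right) = x_i^{\top}\beta
$$

$$
\text{Poisson: } \operatorname{Var}(Y_i) = \mu_i,
\qquad
\text{negative binomial: } \operatorname{Var}(Y_i) = \mu_i + k\,\mu_i^{2}
$$

Because the mean model is the same in both cases, offset = log(exposure) carries over unchanged. The log(exposure - counts) form only arises with the logit link of a binomial model, which is a different link function.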

JacobSimonsen
Barite | Level 11

I agree that offset can be used in negative binomial regression as it is used in Poisson or any other regression model.
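
As a minimal sketch of what that looks like in proc genmod: the data set, variable names (counts, exposure, x) and values below are made up purely for illustration and are not from the original question.

```sas
/* Toy data: invented values, for illustration only */
data work.rates;
    input counts exposure x;
    log_exposure = log(exposure);   /* offset must be on the log (link) scale */
    datalines;
2 100 0
9 120 0
1 90 0
4 110 0
12 100 1
3 95 1
20 130 1
7 105 1
;
run;

/* Negative binomial regression for rates: same offset as in the Poisson case */
proc genmod data=work.rates;
    model counts = x / dist=negbin link=log offset=log_exposure;
run;
```

Replacing dist=negbin with dist=poisson gives the corresponding Poisson rate model; the offset= variable stays the same.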

But I do not necessarily agree that negative binomial regression is a good model. If you estimate rates from survival data, assuming piecewise constant hazard rates, then the likelihood function is the same as if you had observed Poisson-distributed counts. This is not the same as assuming Poisson-distributed counts. Using overdispersion as an argument for changing to the negative binomial distribution therefore relies on a distributional assumption that never needed to hold in the first place. Furthermore, the negative binomial likelihood cannot be derived from a likelihood based on the times to events. In fact, if you simulate data from an exponential distribution (that is, a constant hazard rate), it is not unlikely that you will observe under- or overdispersion in the aggregated count data, even though all assumptions are truly satisfied.

So, to conclude, be very careful when you change your model from Poisson to negative binomial: the confidence intervals become nonsense in the sense that you can influence them yourself by changing how finely you aggregate your data. For example, if you aggregate your data on one additional binary covariate, each cell is split into two. This will have a dramatic effect on the confidence intervals when negative binomial regression is used, even though the covariate is not used in the model. The Poisson regression will give unchanged estimates. This is because the negative binomial regression uses the observed variance as its variance, and that is not meaningful if the original data were time-to-event data (which were aggregated to count data).

