10-27-2016 12:17 PM
I've been working through some of the chapters in the Handbook of Statistical Analyses using SAS book. In chapter 9, section 3.2, Analyzing FAP Data, they use PROC GENMOD with the gamma distribution because there are some rather large counts. I've done some googling but can't find a clear explanation of when I should use the gamma distribution as opposed to the Poisson or negative binomial. When I analyze the FAP data using the Poisson distribution, I get very different results than with the gamma.
Any thoughts greatly appreciated!
The code and data can be downloaded here:
10-27-2016 02:26 PM
I don't have the book, so I can't comment on that example. However, there is an example in the PROC GENMOD doc that discusses using a gamma distribution for survival-type data. Maybe relevant?
10-27-2016 11:34 PM
Yes, Rick pointed you to a good example. The gamma is usually used for positive, continuous values (such as age or survival time).
The Poisson or negative binomial is usually used for sparse, discrete data. If the probability of the event occurring is very small, you should consider using the Poisson or negative binomial.
10-30-2016 11:55 AM
The gamma is a two-parameter continuous distribution defined for 0 < y < infinity. It is generally used for right-skewed data (the one-parameter exponential distribution is a special case). In principle, it would not be used for counts, which are discrete. The Poisson and negative binomial are two common and popular distributions for count data. However, these distributions can be problematic when the counts are very large. For example, bacterial cell counts might take values such as 10^4, 10^5, 10^8, and so on. In principle, every integer from 0 to 10^8 is possible under a discrete distribution, but you can see the problem in treating such a variable as discrete. Thus, the gamma (or log-normal or Weibull) is often used as a continuous approximation to the discrete distribution when the counts are very large or their range is very wide. The gamma has the desirable property that the variance is a function of the mean (it is proportional to the square of the mean), which mirrors a property of typical count distributions.
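To make the variance-mean relationship concrete, here are the variance functions for the three distributions as I understand GENMOD parameterizes them. Here \mu is the mean, k is the negative binomial dispersion parameter, and \nu is the gamma scale (shape) parameter:

```latex
\begin{align*}
\text{Poisson:}           \quad & \operatorname{Var}(Y) = \mu \\
\text{Negative binomial:} \quad & \operatorname{Var}(Y) = \mu + k\mu^{2} \\
\text{Gamma:}             \quad & \operatorname{Var}(Y) = \frac{\mu^{2}}{\nu}
\end{align*}
```

For large \mu the k\mu^2 term dominates the negative binomial variance, so its variance grows roughly like the gamma's. That is one way to see why the gamma can stand in for an overdispersed count distribution when the counts are large, while the Poisson (variance equal to the mean) can give quite different standard errors.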
Be careful when using the gamma: it is defined only for values of y larger than 0. If you have any zeros, they become missing values in the analysis (something you probably don't want). There are more general versions of the gamma that allow for 0 values, but these are not available in GENMOD.
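To compare the candidate distributions on your own data, something along these lines might work. This is only a sketch: the data set name mydata and the variables y_count, treat, and baseline are placeholders I made up, not the names used in the book's FAP data. I specify LINK=LOG in all three fits so the coefficients are on the same scale. As far as I know, the default link for DIST=GAMMA in GENMOD is the reciprocal, which by itself can make gamma and Poisson estimates look very different.

```sas
/* Placeholder names (all hypothetical): mydata, y_count, treat, baseline. */

/* Count the nonmissing responses that are zero or negative. */
/* These observations will not be used by the gamma fit.     */
proc sql;
   select count(*) as n_nonpositive
   from mydata
   where y_count <= 0 and y_count is not missing;
quit;

/* Poisson fit: variance is assumed equal to the mean. */
proc genmod data=mydata;
   class treat;
   model y_count = treat baseline / dist=poisson link=log;
run;

/* Negative binomial fit: adds a dispersion parameter for extra variation. */
proc genmod data=mydata;
   class treat;
   model y_count = treat baseline / dist=negbin link=log;
run;

/* Gamma fit: continuous approximation, valid only for y_count > 0. */
proc genmod data=mydata;
   class treat;
   model y_count = treat baseline / dist=gamma link=log;
run;
```

In the Poisson output, check the Value/DF column for the deviance and Pearson chi-square in the "Criteria For Assessing Goodness Of Fit" table. Values well above 1 suggest overdispersion, in which case the Poisson standard errors are probably too small and the negative binomial or gamma results are likely more trustworthy.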