ml
Calcite | Level 5

I am wondering if guidance/suggestions can be offered about the comparability of results for negative binomial regression from GENMOD and COUNTREG in SAS 9.3.

I am using a Statistics Canada public use microdata file with non-integer weights.

I know that GENMOD uses GLM-style coding for classification variables but is this also true for COUNTREG? (It appears to be so but I have seen no documentation to that effect though an example strongly hints that this is so.)

The resulting parameter estimates look different (the signs are the same, but the magnitudes differ in the 1st or 2nd decimal place). I am using the defaults in both procedures (Newton-Raphson in COUNTREG and, I assume, maximum likelihood in GENMOD); both procedures converge. Though I am quite content to accommodate such differences, is there a general preference for one procedure over the other based on the default algorithms and types of data?

More interestingly, COUNTREG has a ‘probcount’ keyword in its OUTPUT statement, which gives the "probability of the response variable taking particular values." But no such facility appears in GENMOD. GENMOD can provide "the predicted probability that the response variable is less than or equal to the value of _LEVEL_ if the multinomial model for ordinal data is used (in other words, Pr(Y <= _LEVEL_), where Y is the response variable)," which is manifestly not the same as the output from COUNTREG. If I wanted to get the equivalent of PROBCOUNT from GENMOD, is there a macro that would enable me to do so?



PGStats
Opal | Level 21

It is likely that the two procedures are reporting different parameterizations of the distribution. In terms of the SAS definition of the distribution, found here:

http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#n164yyfgppe...

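the expected count is mu = n*(1-p)/p and the variance of the counts is sigma2 = n*(1-p)/p**2, where p is the probability of success and the count is the number of failures before the n-th success.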

When you fit a NEGBIN(P=2) in COUNTREG, the relationship between the counts and their variances is modelled as Sigma2 = mu + Alpha*mu**2. Thus, for an estimated count (mu = exp(X'*Beta)) and value of Alpha, the parameters of the distribution are: n = 1/Alpha, p = 1/(1+Alpha*mu).

But, when you fit a NEGBIN(P=1), the relationship between the counts and their variances is modelled as Sigma2 = mu + Alpha'*mu (that's Alpha prime). Thus, for an estimated count (mu) and value of Alpha', the parameters of the distribution are: n = mu/Alpha', p = 1/(1+Alpha'). (Note: the simple fact that there are two ways to fit a negative binomial in COUNTREG already warns us that we should be more cautious when fitting that distribution.)
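As a quick sanity check on the two conversions above, here is a minimal Python sketch. It assumes the convention in which p is the probability of success and the count is the number of failures before the n-th success, so that the mean is n*(1-p)/p and the variance is n*(1-p)/p**2. The function names and the values of mu and alpha are made up for illustration and do not come from any fitted model.

```python
def nb2_params(mu, alpha):
    """NEGBIN(P=2): variance modelled as mu + alpha*mu**2."""
    return 1.0 / alpha, 1.0 / (1.0 + alpha * mu)


def nb1_params(mu, alpha_prime):
    """NEGBIN(P=1): variance modelled as mu + alpha_prime*mu."""
    return mu / alpha_prime, 1.0 / (1.0 + alpha_prime)


def moments(n, p):
    """Mean and variance of the number of failures before n successes."""
    return n * (1.0 - p) / p, n * (1.0 - p) / p ** 2


mu, alpha = 3.0, 0.5  # illustrative values only

mean2, var2 = moments(*nb2_params(mu, alpha))
mean1, var1 = moments(*nb1_params(mu, alpha))

print("P=2:", mean2, var2, "target variance:", mu + alpha * mu ** 2)
print("P=1:", mean1, var1, "target variance:", mu + alpha * mu)
```

In both cases the mean comes back as mu and the variance matches the modelled relationship. Note that under P=2 it is p that changes with mu, while under P=1 it is n that changes with mu.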

I haven't checked the parameterization fitted by proc GENMOD. It uses the same default link (log) as COUNTREG, but I wouldn't expect its parameter k to be the same as COUNTREG's Alpha described above.
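If GENMOD's k does turn out to play the same role as Alpha in the P=2 form (variance = mu + k*mu**2, which is an assumption to verify against the GENMOD documentation), then something like PROBCOUNT can be rebuilt by hand from a predicted mean and the dispersion estimate. The Python sketch below shows only the arithmetic; `negbin_pmf`, `mu_hat` and `k_hat` are hypothetical names and values, not output from either procedure.

```python
from math import exp, lgamma, log


def negbin_pmf(y, mu, k):
    """P(Y = y) for a negative binomial with mean mu and variance mu + k*mu**2.

    Assumes n = 1/k and p = 1/(1 + k*mu); n need not be an integer,
    so the binomial coefficient is written with log-gamma functions.
    """
    n = 1.0 / k
    p = 1.0 / (1.0 + k * mu)
    log_pmf = (lgamma(y + n) - lgamma(n) - lgamma(y + 1)
               + n * log(p) + y * log(1.0 - p))
    return exp(log_pmf)


mu_hat, k_hat = 3.0, 0.5  # hypothetical predicted mean and dispersion

# Probability of each count 0..5 for this one observation
probs = [negbin_pmf(y, mu_hat, k_hat) for y in range(6)]
for y, pr in enumerate(probs):
    print(y, round(pr, 4))
```

Applied to every observation's predicted mean, this gives one probability per requested count. Whether the result agrees with COUNTREG's PROBCOUNT depends on the two procedures really sharing this parameterization.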

I hope this helps more than it confuses...

PG

PG
ml
Calcite | Level 5

Thanks PG  - the recent SAS book on overdispersion, which had not arrived when I posted this question, also proved quite useful in this regard.

