11-16-2009 01:02 PM

I've just upgraded from SAS v9.1 to v9.2 and noticed the nice addition of AIC and other information criteria to the Goodness-of-fit output of the GENMOD procedure. However, I also noticed that the Deviance and Pearson Chi-square are no longer printed by default, but must be requested with the AGGREGATE option in the MODEL statement.

The trouble is that, for a reason I don't quite understand, the Full Log Likelihood equals the Log Likelihood when I leave the AGGREGATE option out, but changes when I specify it, and the AIC, which is calculated from it, changes as well. The output is identical in all other respects.

As an example, here is the Goodness-of-fit output from two models (dist=bin, link=logit) that are identical save for the AGGREGATE option. Note in particular the change in Full Log Likelihood.

Without AGGREGATE option:

Criterion                     DF        Value    Value/DF
Log Likelihood                      -383.3357
Full Log Likelihood                 -383.3357
AIC (smaller is better)              770.6714
AICC (smaller is better)             770.6818
BIC (smaller is better)              780.7907

With AGGREGATE option:

Criterion                     DF        Value    Value/DF
Deviance                    1082     722.9896      0.6682
Scaled Deviance             1082     722.9896      0.6682
Pearson Chi-Square          1082    1067.9233      0.9870
Scaled Pearson X2           1082    1067.9233      0.9870
Log Likelihood                      -383.3357
Full Log Likelihood                 -372.1276
AIC (smaller is better)              748.2552
AICC (smaller is better)             748.2655
BIC (smaller is better)              758.3744

All other output is identical. Should these values not be the same?
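
A quick arithmetic check on the numbers above may help frame the question. This is only a sketch: it assumes the usual definition AIC = -2 × (full log likelihood) + 2p, where p is the number of estimated parameters, and the variable names are made up for illustration.

```python
# Values copied from the two Goodness-of-fit tables above.
full_ll_plain, aic_plain = -383.3357, 770.6714   # without AGGREGATE
full_ll_aggr, aic_aggr = -372.1276, 748.2552     # with AGGREGATE

# Assumption: AIC = -2 * full_log_likelihood + 2 * p.
# Solve for p in each run to see whether the penalty term changed.
p_plain = (aic_plain + 2 * full_ll_plain) / 2
p_aggr = (aic_aggr + 2 * full_ll_aggr) / 2

# Compare the shift in AIC with -2 times the shift in full log likelihood.
ll_shift = full_ll_aggr - full_ll_plain
aic_shift = aic_aggr - aic_plain

print("implied p:", round(p_plain, 4), round(p_aggr, 4))
print("shift in full log likelihood:", round(ll_shift, 4))
print("shift in AIC:", round(aic_shift, 4), "vs -2 * shift:", round(-2 * ll_shift, 4))
```

Under that assumption both runs imply the same number of parameters, and the whole difference in AIC is -2 times the difference in Full Log Likelihood, so the discrepancy is confined to the Full Log Likelihood itself.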

11-24-2009 10:41 AM

The full log likelihood includes the combinatorial (n-choose-r) term that is omitted from the log likelihood. The value of that term depends on how the populations are defined, and that is what the AGGREGATE (or AGGREGATE=) option controls. If no aggregation is done, each observation is treated as a separate population of size 1, so the n-choose-r term is 1 for every observation, its log is 0, and the two statistics coincide.
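
A minimal sketch of that point, using made-up data rather than anything from the thread: the same four binary responses are treated first as four populations of size 1 and then as one pooled population of size 4. The function names, the data, and the fitted probability of 0.5 are all hypothetical, and this is not how GENMOD is implemented internally.

```python
from math import comb, log

def kernel_log_likelihood(populations, p):
    """Binomial log likelihood without the n-choose-r term."""
    return sum(r * log(p) + (n - r) * log(1 - p) for n, r in populations)

def combinatorial_term(populations):
    """Sum of log(n choose r) over populations, given as (n, r) pairs."""
    return sum(log(comb(n, r)) for n, r in populations)

# Hypothetical data: four observations, two events, same covariate values.
ungrouped = [(1, 1), (1, 0), (1, 1), (1, 0)]  # each observation is its own population
grouped = [(4, 2)]                            # the same observations pooled

p_hat = 0.5  # hypothetical fitted probability, identical in both layouts

for name, pops in [("ungrouped", ungrouped), ("grouped", grouped)]:
    kernel = kernel_log_likelihood(pops, p_hat)
    extra = combinatorial_term(pops)
    print(name, "log likelihood:", round(kernel, 4), "full:", round(kernel + extra, 4))
```

The kernel is the same in both layouts, while the combinatorial term is 0 for the ungrouped data and log(6) for the pooled data, which mirrors the pattern in the two outputs posted above.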

Posted in reply to StatDave_sas

02-26-2010 10:31 AM

Hi again,

I haven't replied to this in quite a long time, but I just got a similar issue with GENMOD that I want to ask about. Thanks for the reply, Dave.

But first things first: I still don't quite follow how to request the Deviance and Pearson's Chi-square statistics that were printed by default in earlier releases but are not in v9.2, i.e. the goodness-of-fit statistics that compare the fitted model with the full perfect-fit model (the model with one parameter per observation). I take it this is what I get when I specify the AGGREGATE option without a variable list?

The new issue I have noticed is that specifying the AGGREGATE option can lead to non-convergence. How come? Does this option affect the model fitting algorithm?
