Aggregate option changes Full Log Likelihood in PR...
11-16-2009 01:02 PM

I've just upgraded from SAS v9.1 to v9.2, and noticed the nice addition of AIC and other information criteria in the GENMOD procedure Goodness-of-fit output. However, I also noticed that the Deviance and Pearson Chi-square are not printed by default, but must be requested by the AGGREGATE option to the MODEL statement.

The trouble is that, for a reason I don't quite understand, the Full Log Likelihood equals the Log Likelihood when I leave the AGGREGATE option out, but changes when I specify AGGREGATE. The AIC, which is calculated from the Full Log Likelihood, changes with it. The output is identical in all other respects.

As an example, here is the goodness-of-fit output from two models (dist=bin, link=logit) that are identical save for the AGGREGATE option. Note in particular the change in Full Log Likelihood.

Without AGGREGATE option:

| Criterion | DF | Value | Value/DF |
|---|---|---|---|
| Log Likelihood | | -383.3357 | |
| Full Log Likelihood | | -383.3357 | |
| AIC (smaller is better) | | 770.6714 | |
| AICC (smaller is better) | | 770.6818 | |
| BIC (smaller is better) | | 780.7907 | |

With AGGREGATE option:

| Criterion | DF | Value | Value/DF |
|---|---|---|---|
| Deviance | 1082 | 722.9896 | 0.6682 |
| Scaled Deviance | 1082 | 722.9896 | 0.6682 |
| Pearson Chi-Square | 1082 | 1067.9233 | 0.9870 |
| Scaled Pearson X2 | 1082 | 1067.9233 | 0.9870 |
| Log Likelihood | | -383.3357 | |
| Full Log Likelihood | | -372.1276 | |
| AIC (smaller is better) | | 748.2552 | |
| AICC (smaller is better) | | 748.2655 | |
| BIC (smaller is better) | | 758.3744 | |

All other output is identical. Should these values not be the same?
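A quick arithmetic check on the printed values may help frame the question. The sketch below assumes the usual definition AIC = -2 * (full log likelihood) + 2p, where p is the number of estimated parameters. The value of p is not stated in the output, so it is backed out from the numbers here rather than taken from the post. Under that assumption, the whole change in AIC comes from the change in the Full Log Likelihood.

```python
from math import isclose

# Values copied from the two goodness-of-fit tables in the post.
full_ll_no_agg = -383.3357
full_ll_agg = -372.1276
aic_no_agg = 770.6714
aic_agg = 748.2552

# Assumed definition: AIC = -2 * full_ll + 2 * p, solved for p.
p_no_agg = (aic_no_agg + 2 * full_ll_no_agg) / 2
p_agg = (aic_agg + 2 * full_ll_agg) / 2

# Shift in the Full Log Likelihood and in the AIC between the two runs.
ll_shift = full_ll_agg - full_ll_no_agg
aic_shift = aic_agg - aic_no_agg

print(f"implied p without AGGREGATE: {p_no_agg:.4f}")
print(f"implied p with AGGREGATE:    {p_agg:.4f}")
print(f"Full Log Likelihood shift:   {ll_shift:.4f}")
print(f"AIC shift:                   {aic_shift:.4f}")
print("AIC shift equals -2 * LL shift:",
      isclose(aic_shift, -2 * ll_shift, abs_tol=1e-9))
```

Both runs imply the same parameter count, so the penalty term is unchanged and the AIC moves by exactly minus twice the shift in the Full Log Likelihood.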


11-24-2009 10:41 AM

The full log likelihood includes the combinatorial term (the log of the binomial coefficient, n-choose-r) that is omitted from the log likelihood. The value of that term depends on how the populations are defined, and that is what the AGGREGATE= option does. If no aggregation is done, each observation is treated as a separate population of size 1, so the combinatorial term is zero and the two statistics coincide.
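To illustrate the point about the combinatorial term, here is a minimal sketch in Python, not SAS and not GENMOD's actual implementation. The data and fitted probabilities are hypothetical, and `binomial_loglik` is a helper written only for this illustration. It evaluates the binomial log likelihood with and without the log of n-choose-r, first treating every observation as its own population of size 1 and then aggregating observations into populations.

```python
from math import comb, isclose, log

def binomial_loglik(events, trials, probs, include_combinatorial):
    """Sum the binomial log likelihood over populations.

    events[j] successes out of trials[j] with fitted probability probs[j].
    When include_combinatorial is True, log(n choose r) is added per population.
    """
    total = 0.0
    for r, n, p in zip(events, trials, probs):
        total += r * log(p) + (n - r) * log(1 - p)
        if include_combinatorial:
            total += log(comb(n, r))
    return total

# Hypothetical data: six 0/1 observations in two covariate patterns,
# with fitted probabilities 0.25 (four observations) and 0.5 (two observations).
ungrouped = dict(events=[1, 0, 0, 0, 1, 0],
                 trials=[1, 1, 1, 1, 1, 1],
                 probs=[0.25, 0.25, 0.25, 0.25, 0.5, 0.5])

# The same data aggregated into two populations of sizes 4 and 2.
grouped = dict(events=[1, 1], trials=[4, 2], probs=[0.25, 0.5])

kernel_ungrouped = binomial_loglik(**ungrouped, include_combinatorial=False)
full_ungrouped = binomial_loglik(**ungrouped, include_combinatorial=True)
kernel_grouped = binomial_loglik(**grouped, include_combinatorial=False)
full_grouped = binomial_loglik(**grouped, include_combinatorial=True)

# With populations of size 1, (1 choose r) = 1, so the extra term is log(1) = 0.
print("ungrouped: full == kernel:", full_ungrouped == kernel_ungrouped)
# Aggregation leaves the kernel unchanged ...
print("kernel unchanged by grouping:", isclose(kernel_ungrouped, kernel_grouped))
# ... but adds log C(4,1) + log C(2,1) = log(8) to the full log likelihood.
print("extra term when grouped:", full_grouped - kernel_grouped)
```

In the output posted above, the Log Likelihood is the same in both runs while the Full Log Likelihood differs by 383.3357 - 372.1276 = 11.2081, which matches this pattern: the fit itself is unchanged and only the combinatorial term moves.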


02-26-2010 10:31 AM

Hi again,

I haven't replied to this in quite a long time, but I just got a similar issue with GENMOD that I want to ask about. Thanks for the reply, Dave.

But first things first: I still don't quite follow how to request the Deviance and Pearson Chi-Square statistics that were printed by default in earlier releases but are not in v9.2, i.e. the goodness-of-fit statistics that compare the fitted model with the full perfect-fit model (the model with one parameter per observation). I take it this is what I get when I specify the AGGREGATE option without a variable list?

The new issue I have noticed is that specifying the AGGREGATE option can lead to non-convergence. How come? Does this option affect the model-fitting algorithm?