T
Calcite | Level 5
I've just upgraded from SAS v9.1 to v9.2, and noticed the nice addition of AIC and other information criteria in the GENMOD procedure Goodness-of-fit output. However, I also noticed that the Deviance and Pearson Chi-square are not printed by default, but must be requested by the AGGREGATE option to the MODEL statement.

What I don't quite understand is this: when I leave the AGGREGATE option out, the Full Log Likelihood equals the Log Likelihood, but when I specify AGGREGATE, the Full Log Likelihood changes, and so do the AIC, AICC and BIC, which are calculated from it. The output is identical in all other respects.

As an example, here is the Goodness-of-fit output from two models (dist=bin, link=logit) that are identical except for the AGGREGATE option. Note in particular the change in Full Log Likelihood.

Without AGGREGATE option:
Criterion                     DF       Value  Value/DF
Log Likelihood                     -383.3357
Full Log Likelihood                -383.3357
AIC (smaller is better)             770.6714
AICC (smaller is better)            770.6818
BIC (smaller is better)             780.7907

With AGGREGATE option:
Criterion                     DF       Value  Value/DF
Deviance                    1082    722.9896    0.6682
Scaled Deviance             1082    722.9896    0.6682
Pearson Chi-Square          1082   1067.9233    0.9870
Scaled Pearson X2           1082   1067.9233    0.9870
Log Likelihood                     -383.3357
Full Log Likelihood                -372.1276
AIC (smaller is better)             748.2552
AICC (smaller is better)            748.2655
BIC (smaller is better)             758.3744


All other output is identical. Should these values not be the same?
2 REPLIES
StatDave
SAS Super FREQ
The full log likelihood includes the combinatorial term (the log of n-choose-r) that is omitted from the log likelihood. The value of this term depends on how the populations are defined, and that is what the AGGREGATE= option controls. If no aggregation is done, each observation is treated as a separate population of size 1, so the term is zero and the two statistics coincide.
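To make the point above concrete, here is a small Python sketch of the idea. It is not GENMOD's actual implementation. The toy data, the fitted probabilities and the function names are all made up for illustration, and the parameter count of 2 in the AIC check is an assumption inferred from the posted numbers (AIC = -2 x Full Log Likelihood + 2k holds for both outputs with k = 2).

```python
import math
from collections import defaultdict

def kernel_loglik(rows):
    """Log likelihood without the combinatorial term, summed over 0/1 observations."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for _, y, p in rows)

def log_binom_terms(rows, aggregate):
    """Sum of log C(n, r) over the populations implied by the aggregation choice."""
    if not aggregate:
        # Each observation is its own population of size 1: C(1, r) = 1, log = 0.
        return 0.0
    counts = defaultdict(lambda: [0, 0])  # pattern -> [n trials, r events]
    for pattern, y, _ in rows:
        counts[pattern][0] += 1
        counts[pattern][1] += y
    return sum(math.log(math.comb(n, r)) for n, r in counts.values())

def aic(full_loglik, n_params):
    """AIC computed from the full log likelihood."""
    return -2.0 * full_loglik + 2.0 * n_params

# Hypothetical data: (covariate pattern, 0/1 response, fitted probability).
rows = [
    ("a", 1, 0.6), ("a", 0, 0.6), ("a", 1, 0.6),
    ("b", 0, 0.3), ("b", 0, 0.3), ("b", 1, 0.3),
]

ll = kernel_loglik(rows)
full_ungrouped = ll + log_binom_terms(rows, aggregate=False)
full_grouped = ll + log_binom_terms(rows, aggregate=True)

print("Log Likelihood:                 ", round(ll, 4))
print("Full Log Likelihood (ungrouped):", round(full_ungrouped, 4))
print("Full Log Likelihood (grouped):  ", round(full_grouped, 4))

# Check against the numbers posted in the question (k = 2 is an assumption).
print("AIC without AGGREGATE:", round(aic(-383.3357, 2), 4))
print("AIC with AGGREGATE:   ", round(aic(-372.1276, 2), 4))
```

In the posted output, the two Full Log Likelihood values differ by 11.2081, which under this reading is the sum of the log binomial coefficients over the aggregated populations. The fitted model and parameter estimates are the same either way, so AIC values are only comparable between models that use the same aggregation.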
T
Calcite | Level 5
Hi again,

I haven't replied to this in quite a long time, but I just got a similar issue with GENMOD that I want to ask about. Thanks for the reply, Dave.

But first things first: I still don't quite follow how to request the Deviance and Pearson Chi-square statistics that were printed by default in earlier releases but are not in v9.2, i.e. the goodness-of-fit statistics that compare the fitted model with the full, perfect-fit model (the model with one parameter per observation). I take it this is what I get when I specify the AGGREGATE option without a variable list?

The new issue that I have noticed is that specifying the AGGREGATE option can lead to problems of non-convergence. How come? Does this option affect the model fitting algorithm?
