I am running some accelerated failure time models using PROC LIFEREG. As part of this, I am using model fit statistics to decide which distribution is appropriate for my data. Specifically, I am looking at the Exponential, Weibull, and Generalized Gamma distributions.
However, I have noticed that my choice of model changes depending on whether or not the NOLOG option is specified in the MODEL statement. That is, if I run code like the following:
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=exponential;
ods select FitStatistics;
run;
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=weibull;
ods select FitStatistics;
run;
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=gamma;
ods select FitStatistics;
run;
Then I find that the Weibull model fits the data best (lowest AIC, AICC, BIC). However, if I add the NOLOG option, and run the code as follows:
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=exponential nolog;
ods select FitStatistics;
run;
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=weibull nolog;
ods select FitStatistics;
run;
PROC LIFEREG data=example;
model time*event(0) = x|y / dist=gamma nolog;
ods select FitStatistics;
run;
Then I find that the Gamma distribution fits the data best by the same criteria. Note that this isn't a case where the AICs are all within ~5 of each other; the differences are large (on the unlogged scale, the Gamma AIC is almost 60 lower than the Weibull AIC, while on the log scale the Gamma AIC is about 20 higher than the Weibull AIC).
Similarly, instead of relying on AIC, etc., I can perform likelihood ratio tests, since an exponential AFT model can be viewed as nested within a Weibull AFT model, and a Weibull AFT model can be viewed as nested within a Generalized Gamma AFT model (e.g. these course notes, p.118). As above, my interpretation changes depending on whether I use the likelihood computed on the logged or the unlogged responses (same pattern: on the log scale, the numbers tell me to choose the Weibull, while on the unlogged scale they tell me to choose the Generalized Gamma).
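To spell out the nesting I am relying on, here is a sketch in my own notation (the symbols $\beta$, $\sigma$, $\delta$ and $\ell$ are mine, not names from the SAS output, and the value $\delta = 1$ is what I understand the LIFEREG parameterization to use). The AFT model is written on the log scale, each reduced model fixes one parameter of the larger one, and the test statistic is the difference in -2 log-likelihoods:

```latex
% AFT model on the log scale: beta = regression coefficients, sigma = scale
\log T_i = x_i^{\top}\beta + \sigma\,\varepsilon_i

% Nesting (each step frees one extra parameter):
%   Exponential:        sigma = 1 (fixed), delta = 1 (fixed)
%   Weibull:            sigma free,        delta = 1 (fixed)
%   Generalized Gamma:  sigma free,        delta free
\text{Exponential} \subset \text{Weibull} \subset \text{Generalized Gamma}

% Likelihood ratio statistic for one step of the nesting (1 degree of freedom)
\Lambda = \bigl(-2\,\ell_{\text{reduced}}\bigr) - \bigl(-2\,\ell_{\text{full}}\bigr)
\;\overset{\text{approx.}}{\sim}\; \chi^2_{1}
```

Both log-likelihoods in $\Lambda$ have to be computed on the same response scale for the difference to be meaningful.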
Unless I am missing something fundamental, I don't understand how these can give me radically different results. A log is a one-to-one transformation, so I don't see how it could have such a dramatic impact on the RELATIVE likelihoods/AICs of the three models (that is, I understand it will give me different absolute fit statistic values, but I don't understand why it changes the relationship between these fit statistics). Further, per the SAS documentation: "When comparing models, you should compare fit criteria based on the log likelihood that is computed by using the response on the same scale, either always based on the log of the response or always based on the response on the original scale." This seems to imply that it doesn't matter which one you use so long as you are consistent across all models in the comparison; but in my case it does matter, and quite markedly so.
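My reasoning, written out as a sketch. It assumes right censoring only, and that the logged and unlogged likelihoods describe the same distribution for $T$; both assumptions are mine, and I have not verified what NOLOG actually computes internally. Here $Y = \log T$, $d_i = 1$ for an observed event and $d_i = 0$ for a censored time, and $f$ and $S$ are the density and survival functions:

```latex
% Log-likelihood on the original (unlogged) scale
\ell_T = \sum_i \Bigl[ d_i \log f_T(t_i) + (1 - d_i) \log S_T(t_i) \Bigr]

% Change of variables Y = log T: only the density picks up a Jacobian term
f_T(t) = \frac{f_Y(\log t)}{t}, \qquad S_T(t) = S_Y(\log t)

% So the two log-likelihoods differ by a term that depends on the data only
\ell_T = \ell_Y - \sum_i d_i \log t_i

% The offset cancels when two models A and B are compared on the same scale
\mathrm{AIC}_T(A) - \mathrm{AIC}_T(B) = \mathrm{AIC}_Y(A) - \mathrm{AIC}_Y(B)
```

Under those assumptions the ranking of the three models should be identical on either scale, which is why the reversal I am seeing surprises me.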
Can anybody help explain why this might be the case? Or which scale I should use to pick a distribution appropriately? The SAS documentation recommends using NOLOG specifically when comparing distributions like the Weibull and the Normal, which compute the likelihood on different scales by default, but otherwise offers no clues as to how to use the NOLOG option when comparing similar distributions like the Weibull and Exponential.