<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROC LIFEREG fit statistics in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/234092#M12365</link>
    <description>&lt;P&gt;I am running some accelerated failure time models using PROC LIFEREG. As part of this, I am using model fit statistics to decide which distribution is appropriate for my data. Specifically, I am looking at the Exponential, Weibull, and Generalized Gamma distributions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I have noticed that my choice of model changes depending on whether or not the NOLOG option is specified in the MODEL statement. That is, if I run code like the following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=exponential;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=weibull;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=gamma;
     ods select FitStatistics;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then I find that the Weibull model fits the data best (lowest AIC, AICC, BIC). However, if I add the NOLOG option, and run the code as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=exponential nolog;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=weibull nolog;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=gamma nolog;
     ods select FitStatistics;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then I find that the Gamma distribution fits the data best by the same criteria. Note that this isn't a case where the AICs are all within ~5 of each other one way or the other; the differences are large (on the unlogged scale, Gamma AIC is almost 60 less than Weibull AIC, while on the log scale Gamma AIC is about 20 higher than Weibull AIC).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Similarly, instead of relying on AIC, etc., I can perform likelihood ratio tests, since an exponential AFT model can be viewed as being nested within a Weibull AFT model, and a Weibull AFT model can be viewed as being nested within a Generalized Gamma AFT model (e.g. &lt;A href="http://www4.stat.ncsu.edu/~dzhang2/st745/chap5.pdf" target="_self"&gt;these course notes,&lt;/A&gt; p.118). As with the above, my interpretation changes depending on whether or not I am using the likelihood on the logged or unlogged responses (same pattern: on the log scale, the numbers tell me to choose the Weibull, while on the unlogged scale the numbers tell me to choose the Generalized Gamma).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Unless I am missing something fundamental, I don't understand how these can give me radically different results. A log is a one-to-one transformation, so I don't see how this would have such dramatic impact on the RELATIVE likelihoods/AICs of the three models (that is, I understand it will give me different absolute fit statistic values, but I don't understand why it is changing the nature of the relationship between these fit statistics). Further, per the SAS documentation:&lt;FONT face="courier new,courier"&gt;&lt;EM&gt; "When comparing models, you should compare fit criteria based on the log likelihood that is computed by using the response on the same scale, either always based on the log of the response or always based on the response on the original scale."&lt;/EM&gt;&lt;/FONT&gt; This seems to imply that it doesn't matter which one you use so long as you are consistent across all models in the comparison; but in my case it does matter, and quite markedly so.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anybody help explain why this might be the case? Or which scale I should use for appropriately picking a distribution? The SAS documentation recommends using NOLOG specifically when comparing distributions like the Weibull and the Normal, which compute the likelihood on different scales by default, but otherwise offers no clues as to how to use the NOLOG option in the context of similar distributions like the Weibull and Exponential.&lt;/P&gt;</description>
    <pubDate>Tue, 10 Nov 2015 18:53:33 GMT</pubDate>
    <dc:creator>RyanSimmons</dc:creator>
    <dc:date>2015-11-10T18:53:33Z</dc:date>
    <item>
      <title>PROC LIFEREG fit statistics</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/234092#M12365</link>
      <description>&lt;P&gt;I am running some accelerated failure time models using PROC LIFEREG. As part of this, I am using model fit statistics to decide which distribution is appropriate for my data. Specifically, I am looking at the Exponential, Weibull, and Generalized Gamma distributions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I have noticed that my choice of model changes depending on whether or not the NOLOG option is specified in the MODEL statement. That is, if I run code like the following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=exponential;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=weibull;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=gamma;
     ods select FitStatistics;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then I find that the Weibull model fits the data best (lowest AIC, AICC, BIC). However, if I add the NOLOG option, and run the code as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=exponential nolog;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=weibull nolog;
     ods select FitStatistics;
run;

PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=gamma nolog;
     ods select FitStatistics;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then I find that the Gamma distribution fits the data best by the same criteria. Note that this isn't a case where the AICs are all within ~5 of each other one way or the other; the differences are large (on the unlogged scale, Gamma AIC is almost 60 less than Weibull AIC, while on the log scale Gamma AIC is about 20 higher than Weibull AIC).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Similarly, instead of relying on AIC, etc., I can perform likelihood ratio tests, since an exponential AFT model can be viewed as being nested within a Weibull AFT model, and a Weibull AFT model can be viewed as being nested within a Generalized Gamma AFT model (e.g. &lt;A href="http://www4.stat.ncsu.edu/~dzhang2/st745/chap5.pdf" target="_self"&gt;these course notes,&lt;/A&gt; p.118). As with the above, my interpretation changes depending on whether or not I am using the likelihood on the logged or unlogged responses (same pattern: on the log scale, the numbers tell me to choose the Weibull, while on the unlogged scale the numbers tell me to choose the Generalized Gamma).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Unless I am missing something fundamental, I don't understand how these can give me radically different results. A log is a one-to-one transformation, so I don't see how this would have such dramatic impact on the RELATIVE likelihoods/AICs of the three models (that is, I understand it will give me different absolute fit statistic values, but I don't understand why it is changing the nature of the relationship between these fit statistics). Further, per the SAS documentation:&lt;FONT face="courier new,courier"&gt;&lt;EM&gt; "When comparing models, you should compare fit criteria based on the log likelihood that is computed by using the response on the same scale, either always based on the log of the response or always based on the response on the original scale."&lt;/EM&gt;&lt;/FONT&gt; This seems to imply that it doesn't matter which one you use so long as you are consistent across all models in the comparison; but in my case it does matter, and quite markedly so.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anybody help explain why this might be the case? Or which scale I should use for appropriately picking a distribution? The SAS documentation recommends using NOLOG specifically when comparing distributions like the Weibull and the Normal, which compute the likelihood on different scales by default, but otherwise offers no clues as to how to use the NOLOG option in the context of similar distributions like the Weibull and Exponential.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Nov 2015 18:53:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/234092#M12365</guid>
      <dc:creator>RyanSimmons</dc:creator>
      <dc:date>2015-11-10T18:53:33Z</dc:date>
    </item>
    <item>
      <title>Re: PROC LIFEREG fit statistics</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/235455#M12460</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you specify NOLOG you are in fact investigating the appropriateness of a DIFFERENT distribution.&lt;/P&gt;
&lt;P align="LEFT"&gt;LNORMAL NOLOG is no longer referring to a LogNormal distribution but to a Normal distribution.&lt;/P&gt;
&lt;P align="LEFT"&gt;5 distributions WITH and WITHOUT NOLOG&amp;nbsp;generate ten(!) different parametric models.&lt;/P&gt;
&lt;P align="LEFT"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align="LEFT"&gt;See for example this NESUG 18 poster/paper&lt;/P&gt;
&lt;P align="LEFT"&gt;Predictive Modeling Using Survival Analysis&lt;/P&gt;
&lt;P&gt;Vadim Pliner, Verizon Wireless, Orangeburg, NY&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;On page 2 we read ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align="LEFT"&gt;PARAMETRIC REGRESSION MODELS&lt;/P&gt;
&lt;P align="LEFT"&gt;In survival analysis, the parametric regression models have this form:&lt;/P&gt;
&lt;P align="LEFT"&gt;Y = β&lt;SUB&gt;0&lt;/SUB&gt; + Σ β&lt;SUB&gt;j&lt;/SUB&gt;x&lt;SUB&gt;j&lt;/SUB&gt; + σε,&lt;/P&gt;
&lt;P align="LEFT"&gt;where Y is either T (survival/failure time) or log(T), x&lt;SUB&gt;j&lt;/SUB&gt; are covariates, ε is a random error, and β&lt;SUB&gt;j&lt;/SUB&gt; and σ are parameters to be estimated. In SAS, the maximum likelihood estimators of the parameters can be calculated using PROC LIFEREG if one of the following classes of survival distribution functions of T is specified (option dist= or d= on the MODEL statement): exponential (d=EXPONENTIAL), Weibull (d=WEIBULL), log-logistic (d=LLOGISTIC), log-normal (d=LNORMAL), generalized gamma (d=GAMMA), logistic (d=LOGISTIC), and normal (d=NORMAL). By default, PROC LIFEREG models Y=log(T) when the first five models are specified, which leads to so called accelerated failure time models. One can suppress the log transformation with the NOLOG option. When the Exponential or Weibull options are specified, adding NOLOG results in the extreme value distribution with one and two parameters, respectively. d=gamma in combination with the NOLOG option means the log-gamma distribution of T. Specifying d=LNORMAL NOLOG is equivalent to just d=NORMAL (without NOLOG). Similarly, d=LLOGISTIC NOLOG leads to the same model as d=LOGISTIC (without NOLOG). And NOLOG has no effect on either d=NORMAL or d=LOGISTIC.&lt;/P&gt;
&lt;P align="LEFT"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P align="LEFT"&gt;Overall, all combinations of values of the two options (d= with or without NOLOG) generate ten different parametric models. To select the best one, two approaches are described below. They are both based on the value of maximized log likelihood, which is computed by PROC LIFEREG.&lt;/P&gt;
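&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough way to check the equivalences described in the quoted passage, one could fit the same model under two specifications that the paper says coincide and compare the output. The sketch below reuses the data set and variable names from the original question (example, time, event, x, y); it has not been run against that data, so treat it as an illustration only.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Per the quoted paper, d=LNORMAL with NOLOG should be the same model as d=NORMAL */
PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=lnormal nolog;
     ods select FitStatistics ParameterEstimates;
run;

/* Reference fit: normal distribution on the untransformed response */
PROC LIFEREG data=example;
     model time*event(0) = x|y / dist=normal;
     ods select FitStatistics ParameterEstimates;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the passage is right, both runs should report matching estimates and fit statistics. The same would not be expected for d=WEIBULL with and without NOLOG, since according to the paper NOLOG turns the Weibull specification into an extreme value model for T.&lt;/P&gt;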
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Thu, 19 Nov 2015 13:53:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/235455#M12460</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2015-11-19T13:53:45Z</dc:date>
    </item>
    <item>
      <title>Re: PROC LIFEREG fit statistics</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/235719#M12474</link>
      <description>&lt;P&gt;Hello Ryan,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for marking my answer as&amp;nbsp;the solution.&lt;/P&gt;
&lt;P&gt;For completeness, here is the URL of the paper I cited:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NESUG 18 (North East SAS Users Group)&lt;/P&gt;
&lt;P align="LEFT"&gt;Predictive Modeling Using Survival Analysis&lt;/P&gt;
&lt;P&gt;Vadim Pliner, Verizon Wireless, Orangeburg, NY&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.lexjansen.com/nesug/nesug05/pos/pos6.pdf" target="_blank"&gt;http://www.lexjansen.com/nesug/nesug05/pos/pos6.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Cheers,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Fri, 20 Nov 2015 16:27:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/PROC-LIFEREG-fit-statistics/m-p/235719#M12474</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2015-11-20T16:27:49Z</dc:date>
    </item>
  </channel>
</rss>

