proc mixed: negative AIC values


09-07-2010 04:40 AM

When modelling in proc mixed, I get negative values for my fit statistics, e.g. AIC.

How should I interpret them?

Which model is best: the more negative, the better?

How can this happen?

Thanks for your help.

Best regards,

Bart



09-07-2010 05:39 PM

AIC is computed as -2LL + 2p where LL is the log-likelihood for the fitted model summed over all observations and p is the number of parameters in the model. The value 2p must be positive, so a negative value for a fit statistic like AIC is due to a negative value for the -2LL part of the equation.

So, how do we get a negative value for the -2LL part of the equation? Consider a standard normal variable X ~ N(0,1). Given X=x, the log-likelihood of the parameters mu=0 and V=1 is

ll = -0.5 * (log(2*pi) + log(V) + (x-0)**2/V)

or

ll = -0.5 * (log(2*pi) + log(1) + (x-0)**2)
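As a quick numerical check of that formula (a minimal Python sketch, purely illustrative; the thread itself is about SAS), the log-likelihood of a single normal observation can be evaluated directly. The key point: when the variance is small the density exceeds 1, so the log-likelihood turns positive and -2LL turns negative.

```python
import math

def normal_ll(x, mu=0.0, v=1.0):
    """Log-likelihood of one observation x under N(mu, v)."""
    return -0.5 * (math.log(2 * math.pi) + math.log(v) + (x - mu) ** 2 / v)

# For a standard normal, the density at x=0 is 1/sqrt(2*pi) ~ 0.3989,
# so the log-likelihood there is negative:
print(normal_ll(0.0))          # ~ -0.9189
# With a tiny variance the density can exceed 1, making the
# log-likelihood positive -- the root of negative -2LL values:
print(normal_ll(0.0, v=1e-4))  # positive
```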

Now, suppose that we rescale x by multiplying by some factor. In particular, let's compute y = x*M where M = 10**i for i = -5,-4,-3,...,3,4,5. The mean will stay the same, but the variance V of the observed value will be M**2. When we compute the log-likelihood of the mean and V, we have

LL = -0.5 * (log(2*pi) + log(M**2) + (y-0)**2/(M**2))

or

LL = -0.5 * (log(2*pi) + log(M**2) + (x*M - 0)**2/(M**2))

The variance cancels out in the last added term because (x*M - 0)**2 = (x**2)*(M**2), so the log-likelihood can be written as

LL = -0.5 * (log(2*pi) + log(M**2) + x**2)

The only part of this expression that varies with the multiplier is -0.5*log(M**2) = -0.5*log(V). As log(V) decreases, the value of LL increases; equivalently, as log(V) decreases, the value of -2LL decreases.
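That cancellation is easy to verify numerically. The Python sketch below (illustrative, not from the original post) evaluates the log-likelihood of the rescaled observation directly and via the simplified form, and shows LL falling as the multiplier M (and hence V) grows:

```python
import math

def ll(y, v):
    """Log-likelihood of one observation y under N(0, v)."""
    return -0.5 * (math.log(2 * math.pi) + math.log(v) + y ** 2 / v)

x = 0.7
for i in (-2, 0, 2):
    m = 10.0 ** i
    direct = ll(x * m, m ** 2)  # LL of the rescaled observation y = x*M
    simplified = -0.5 * (math.log(2 * math.pi) + math.log(m ** 2) + x ** 2)
    assert math.isclose(direct, simplified)  # the (x*M)**2/M**2 term cancels
    print(i, direct)  # LL drops as the multiplier grows
```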

Here is an experiment for you to perform. Take your observed data and multiply every observation by 1000. Fit the same model that gave you negative values for the fit statistics. Parameter estimates and their standard errors should all be scaled by the multiplier 1000. Test statistics (F-test and t-test values) should all be the same. (There could be very small differences in any of these statistics because the computer carries out base-2 arithmetic on base-10 inputs.) But the value of -2LL and the fit statistics will all increase. My guess is that they will all be positive at the end of this experiment.
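The same experiment can be mimicked outside SAS with the simplest possible model, an intercept-only normal fit. This Python sketch (an illustration under that assumption, not PROC MIXED itself) computes AIC = -2LL + 2p for simulated small-scale data and for the same data multiplied by 1000:

```python
import math
import random

def normal_aic(data):
    """AIC = -2LL + 2p for the MLE fit of N(mu, v); p = 2 parameters."""
    n = len(data)
    mu = sum(data) / n
    v = sum((y - mu) ** 2 for y in data) / n  # MLE variance
    ll = -0.5 * sum(math.log(2 * math.pi) + math.log(v) + (y - mu) ** 2 / v
                    for y in data)
    p = 2
    return -2 * ll + 2 * p

random.seed(1)
small = [random.gauss(0, 0.01) for _ in range(50)]  # small-scale observations
print(normal_aic(small))                            # negative AIC
print(normal_aic([y * 1000 for y in small]))        # positive after rescaling
```

Rescaling by M adds 2*n*log(M) to -2LL, which is exactly why the fit statistics climb back above zero.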

HTH






09-08-2010 05:41 PM

Thanks a lot. It works out, so there is nothing to worry about with negative AIC values.

Best regards,

Bart
