Statistical Procedures

DDK · Posted 09-15-2015 09:32 AM

Hello,

Is it possible to assess which distribution fits better using a likelihood ratio? For example, if you want to assess whether a model fits better under a negative binomial distribution than under a Poisson distribution, can you use the log-likelihoods in the 'Fit Statistics' section of the output to perform a test such as the one explained at http://support.sas.com/kb/24/474.html?

The example on that page shows the difference in df between two models, where some variables are removed from one of them, which produces a difference in df. But what df should you specify when you want to compare the fit of two distributions? The df then stay almost the same (in several things I tried, the difference was 0.1 or less).

Thanks in advance for the help.

Rick_SAS · Posted 09-15-2015 10:12 AM

This does not directly answer your question, but you might find it helpful to read the documentation for the SEVERITY procedure in SAS/ETS software. The SEVERITY procedure fits multiple models to data and provides statistics that you can assess to determine which model you want to use. It provides several likelihood statistics (-2LL, AIC, AICC, BIC) as well as ECDF statistics. It also provides graphical diagnostic plots to accompany the statistics.

lvm · Posted 09-15-2015 10:44 AM

You can conduct an LR test based on log-likelihoods if the two distributions are nested (i.e., if one is a special case of the other). For instance, the Poisson is a special case of the negative binomial (as 1/k goes to 0, the negative binomial becomes the Poisson). In this example, the negative binomial has one more parameter than the Poisson (many sources use k as the overdispersion parameter of the negative binomial, but SAS uses scale = 1/k in several procedures). The df for the LR test is 1 because that is the difference in the number of parameters. The LR statistic is -2 times the difference in log-likelihoods (simpler model minus more complex model). Under the null hypothesis (H0: the distribution is the simpler one), the test statistic nominally has a chi-squared distribution. Caution: when the scale parameter has to sit on the boundary of its range to give the simpler distribution, the test statistic may have a more complex distribution than a simple chi-squared with 1 df. Here the scale parameter ranges from 0 to infinity, and scale=0 gives you the simpler distribution, so the boundary problem applies. Many ignore this issue.
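As a rough sketch of the arithmetic described above, the following Python snippet computes the LR statistic from two -2LL values and a p-value on 1 df. The -2LL numbers are hypothetical, not taken from any output in this thread. The halved "boundary" p-value is an assumption that goes beyond the post: it uses the commonly cited 50:50 mixture of a point mass at 0 and a chi-squared with 1 df for a single parameter tested on its boundary.

```python
import math

def lr_test_1df(neg2ll_simple, neg2ll_complex):
    """LR test for nested models that differ by one parameter (df = 1).

    Both inputs are -2 log-likelihood values, as reported by two
    separate runs (e.g., Poisson and negative binomial).
    """
    # -2 * (LL_simple - LL_complex) equals the difference of the -2LL values
    lr = max(neg2ll_simple - neg2ll_complex, 0.0)  # guard against numerical noise
    # Upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_naive = math.erfc(math.sqrt(lr / 2.0))
    # Assumed boundary adjustment: 50:50 mixture of 0 and chi-squared(1)
    p_boundary = 0.5 * p_naive if lr > 0 else 1.0
    return lr, p_naive, p_boundary

# Hypothetical -2LL values for illustration only
neg2ll_poisson = 1250.4
neg2ll_negbin = 1243.1

lr, p_naive, p_boundary = lr_test_1df(neg2ll_poisson, neg2ll_negbin)
print(f"LR = {lr:.2f}, naive p = {p_naive:.4f}, boundary-adjusted p = {p_boundary:.4f}")
```

The chi-squared tail is written with `math.erfc` only so that the sketch needs no third-party packages; the identity used holds for 1 df only.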

Be careful with different procedures. Some programs may not give the actual log-likelihood. For instance, many log-likelihoods can be written as a sum of terms, where some terms involve parameters and data, and some terms involve only the data (not the parameters). To be computationally efficient, the terms not involving parameters may not be calculated or displayed. This is fine when one is comparing log-likelihoods all for the same distribution (with the same procedure), but could cause trouble if you are comparing distributions.
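To make the point about data-only terms concrete, here is a small sketch using the Poisson log-likelihood, where the term log(y!) does not involve the mean. The counts and the function name are hypothetical. Within the Poisson family, the difference between two candidate means is the same with or without that term, but the reported value itself shifts by a constant, which matters once a different distribution enters the comparison.

```python
import math

def poisson_loglik(y, mu, include_constant=True):
    """Poisson log-likelihood for counts y with a common mean mu."""
    total = 0.0
    for yi in y:
        total += yi * math.log(mu) - mu      # terms involving the parameter
        if include_constant:
            total -= math.lgamma(yi + 1)     # log(y!), data only
    return total

counts = [0, 2, 1, 4, 3, 0, 5]  # hypothetical data
mu_hat = sum(counts) / len(counts)

full = poisson_loglik(counts, mu_hat)
kernel = poisson_loglik(counts, mu_hat, include_constant=False)
constant = kernel - full  # equals the sum of log(y!) over the data

# Same-distribution comparisons are unaffected by the dropped term
diff_full = poisson_loglik(counts, 2.0) - poisson_loglik(counts, 3.0)
diff_kernel = (poisson_loglik(counts, 2.0, include_constant=False)
               - poisson_loglik(counts, 3.0, include_constant=False))

print(f"full = {full:.4f}, kernel = {kernel:.4f}, dropped constant = {constant:.4f}")
print(f"difference between two means: {diff_full:.4f} vs {diff_kernel:.4f}")
```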

Be careful with GLIMMIX in particular. If you use it (say, with different choices of distributions), make sure you are not using one of the pseudo-likelihood methods (RSPL, MSPL, ...). You need to be using the actual log-likelihood (method=quad).

DDK · Posted 09-16-2015 03:01 AM

Just a quick question. I understand that there is an extra parameter with the negative binomial and that the test should therefore use df=1. Why is this not reflected in the Pearson Chi-square/DF statistic in the 'Fit statistics for conditional distribution' section? The section reports the Pearson Chi-square and the value of Pearson Chi-square/DF, so I should be able to calculate the df. Or is this referring to a different df?

lvm · Posted 09-16-2015 09:26 AM

I didn't notice in your original post, but it looks like you are using GLIMMIX. The df used in the PearsonChiSq/df calculation does not involve the scale parameter. The 0.1 or so difference you noticed in the df calculation is just rounding. The proc (also GENMOD) uses the same df for Poisson and NB. The LR test to compare distributions has to be done by hand (or in a data step using ODS output), using df=1. Use -2LL from two runs of the procedure.
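The rounding explanation can be checked with a short sketch. It assumes the ratio is displayed to two decimal places, and the chi-square values are hypothetical. Back-calculating df from a rounded ratio only pins the df down to an interval, so two runs that share the same df can look as if they differ by a fraction.

```python
def implied_df_range(pearson_chisq, displayed_ratio, decimals=2):
    """Range of df consistent with a ratio that was rounded for display."""
    half_step = 0.5 * 10 ** (-decimals)
    low = pearson_chisq / (displayed_ratio + half_step)
    high = pearson_chisq / (displayed_ratio - half_step)
    return low, high

# Hypothetical output from two runs that both use df = 120
runs = {
    "Poisson": (118.42, 0.99),
    "negative binomial": (101.77, 0.85),
}

for name, (chisq, ratio) in runs.items():
    point = chisq / ratio                    # naive back-calculation
    low, high = implied_df_range(chisq, ratio)
    print(f"{name}: naive df = {point:.2f}, consistent range = [{low:.2f}, {high:.2f}]")
```

In this made-up case the naive back-calculated df differ by about 0.1 between the runs, even though both intervals contain the same true value.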

DDK · Posted 09-17-2015 02:53 AM

Ah, thanks, that clarifies it. What if you have repeated measurements (R-side variance)? The SAS documentation states that this is not supported for method=quad. Is there a way around that in GLIMMIX?

lvm · Posted 09-17-2015 08:50 AM

You have to use a G-side covariance structure for the repeated measures with non-normal distributions when you use the quadrature or Laplace estimation methods. The book by Walt Stroup on GLMMs is excellent on this topic (with lots of SAS code available online).

DDK · Posted 09-17-2015 10:23 AM

Thanks for all the help. It is clarified now. Will look at your book suggestion.

SteveDenham · Posted 09-22-2015 08:13 AM

The idea of testing for a better fit for a distribution is intriguing, but sounds like a lot of work when comparison of information criteria ought to do the trick on its own. So long as the data, model and any random statements are the same, and the same link is used (and appropriate) for both distributions, AIC provides an excellent choice for distribution selection, in my experience.

Steve Denham
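A minimal sketch of the information-criterion comparison, using the standard definition AIC = -2LL + 2p. The -2LL values and parameter counts are hypothetical, and the sketch assumes both -2LL values come from true log-likelihoods that include the same data-only terms.

```python
def aic(neg2ll, n_params):
    """Akaike information criterion from a -2 log-likelihood value."""
    return neg2ll + 2 * n_params

# Hypothetical fits: same data, same model, same link, different distribution
fits = {
    "Poisson": {"neg2ll": 1250.4, "n_params": 3},
    "negative binomial": {"neg2ll": 1243.1, "n_params": 4},  # extra scale parameter
}

aic_values = {name: aic(f["neg2ll"], f["n_params"]) for name, f in fits.items()}
best = min(aic_values, key=aic_values.get)  # smaller AIC is preferred

for name, value in aic_values.items():
    print(f"{name}: AIC = {value:.1f}")
print(f"preferred by AIC: {best}")
```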
