comparing distributions likelihood ratio

09-15-2015 09:32 AM

Hello,

Is it possible to assess which distribution fits better using a likelihood ratio? For example, if you want to assess whether a model fits better under a negative binomial distribution than under a Poisson distribution, can you use the log likelihoods in the 'fit statistics' section of the output to perform a test such as the one explained at http://support.sas.com/kb/24/474.html?

In the example on that page, the difference in df between the two models arises because some variables are removed from one model. But what should you specify when you want to compare the fit of two distributions? The df then stay almost the same (in several things I tried, the difference was 0.1 or less).

Thanks in advance for the help.

All Replies

09-15-2015 10:12 AM

This does not directly answer your question, but you might find it helpful to read the documentation for the SEVERITY procedure in SAS/ETS software. The SEVERITY procedure fits multiple models to data and provides statistics that you can assess to determine which model you want to use. It provides several likelihood statistics (-2LL, AIC, AICC, BIC) as well as ECDF statistics. It also provides graphical diagnostic plots to accompany the statistics.

09-15-2015 10:44 AM

You can conduct an LR test based on log-likelihoods if the two distributions are nested (i.e., if one is a special case of the other). For instance, the Poisson is a special case of the negative binomial (when 1/k = 0, negative binomial = Poisson). In this example, the negative binomial has one more parameter than the Poisson (many sources use k as the overdispersion parameter of the negative binomial, but SAS uses scale = 1/k in several procedures). The df for the LR test is 1 because the two models differ by one parameter. The LR statistic is -2 times the difference in log-likelihoods (simpler model minus more complex model). Under the null hypothesis (H0: the distribution is the simpler one), the test statistic nominally has a chi-squared distribution.

Caution: when the scale parameter must sit on the boundary of its range to give the simpler distribution, the test statistic may have a more complex distribution than a simple chi-squared (with 1 df). Here the scale parameter ranges from 0 to infinity, and scale = 0 gives the simpler distribution, so the boundary issue applies. Many ignore this issue.
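The arithmetic of the test described above can be sketched outside SAS. This is a minimal Python illustration, not output from any SAS procedure: the log-likelihood values are hypothetical placeholders, and the halved p-value is an assumption added here, based on the commonly cited 50:50 mixture of chi-squared(0) and chi-squared(1) for a single parameter tested on its boundary.

```python
import math

def lr_test(loglik_simple, loglik_complex):
    """LR test for two nested models that differ by one parameter (df = 1)."""
    # LR statistic: -2 * (logL of simpler model - logL of more complex model)
    lr = -2.0 * (loglik_simple - loglik_complex)
    lr = max(lr, 0.0)  # guard against tiny negative values from numerical noise
    # Upper tail of a chi-squared variable with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_naive = math.erfc(math.sqrt(lr / 2.0))
    # Assumption (not from the thread): boundary-adjusted p-value from a
    # 50:50 mixture of chi-squared(0) and chi-squared(1)
    p_boundary = 0.5 * p_naive if lr > 0 else 1.0
    return lr, p_naive, p_boundary

# Hypothetical full log-likelihoods copied from two fits of the same model
ll_poisson = -1250.4
ll_negbin = -1180.7

lr, p_naive, p_boundary = lr_test(ll_poisson, ll_negbin)
print(f"LR = {lr:.1f}, naive p = {p_naive:.3g}, boundary-adjusted p = {p_boundary:.3g}")
```

Both log-likelihoods have to be full log-likelihoods computed on the same data for the statistic to be meaningful.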

Be careful with different procedures. Some programs may not give the actual log-likelihood. For instance, many log-likelihoods can be written as a sum of terms, where some terms involve parameters and data, and some terms involve only the data (not the parameters). To be computationally efficient, the terms not involving parameters may not be calculated or displayed. This is fine when one is comparing log-likelihoods for the same distribution (with the same procedure), but could cause trouble if you are comparing distributions.
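To make the point about dropped terms concrete, here is a small Python sketch for the Poisson case with a single common mean. The data and the function name `poisson_loglik` are hypothetical. The data-only term is -sum(log(y!)); leaving it out shifts the log-likelihood by the same amount for every value of the mean, which is harmless within one distribution but not when comparing across distributions.

```python
import math

def poisson_loglik(y, mu, include_constant=True):
    """Poisson log-likelihood for counts y with a common mean mu."""
    # Terms that involve the parameter mu
    kernel = sum(yi * math.log(mu) - mu for yi in y)
    if not include_constant:
        return kernel
    # Term that involves only the data: -sum(log(y!)), with lgamma(y + 1) = log(y!)
    constant = -sum(math.lgamma(yi + 1) for yi in y)
    return kernel + constant

counts = [0, 2, 3, 1, 4]  # hypothetical data

for mu in (1.5, 2.0):
    full = poisson_loglik(counts, mu)
    kernel = poisson_loglik(counts, mu, include_constant=False)
    # The gap is the same for every mu because it depends only on the data
    print(f"mu={mu}: full={full:.4f}, kernel={kernel:.4f}, gap={kernel - full:.4f}")
```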

Be careful with different procedures. If you use GLIMMIX (say, with different choices of distributions), make sure you are not using one of the pseudo-likelihood methods (rspl, mspl, ...). You need to be using the actual log-likelihood (method=quad).

09-16-2015 03:01 AM

Just a quick question. I understand that the negative binomial has an extra parameter and that the df should therefore be 1. Why is this not reflected in the Pearson Chi-square/DF statistic in the 'fit statistics for conditional distribution' section? That section reports the Pearson Chi-square and the Pearson Chi-square/DF, so I should be able to calculate the df. Or is this referring to a different df?

Solution

09-25-2015
06:23 AM

09-16-2015 09:26 AM

09-17-2015 02:53 AM

Ah, thanks, that clarifies it. What if you have repeated measurements (R-side variance)? The SAS documentation states that this is not supported for method=quad. Is there a way around that in GLIMMIX?

09-17-2015 08:50 AM

You have to use a G-side covariance structure for the repeated measures with non-normal distributions when you use quadrature or Laplace estimation methods. The book by Walt Stroup on GLMMs is excellent on this topic (with lots of SAS code available on-line).

09-17-2015 10:23 AM

Thanks for all the help. It is clarified now. Will look at your book suggestion.

09-22-2015 08:13 AM

The idea of testing for a better fit for a distribution is intriguing, but sounds like a lot of work when comparison of information criteria ought to do the trick on its own. So long as the data, model and any random statements are the same, and the same link is used (and appropriate) for both distributions, AIC provides an excellent choice for distribution selection, in my experience.

Steve Denham
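The information-criterion comparison suggested above can be sketched in a few lines of Python, using AIC = -2 log L + 2k. The log-likelihoods and parameter counts below are hypothetical placeholders rather than SAS output, and the comparison assumes both fits report the full log-likelihood for the same data, model, and link.

```python
def aic(loglik, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * loglik + 2.0 * n_params

# Hypothetical fits: (full log-likelihood, number of estimated parameters).
# The negative binomial carries one extra parameter (the scale).
candidates = {
    "poisson": (-1250.4, 3),
    "negbin": (-1180.7, 4),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # smaller AIC is preferred

for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {value:.1f}")
print("preferred distribution:", best)
```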