comparing distributions likelihood ratio

Occasional Contributor DDK
Occasional Contributor
Posts: 16

Hello,

Is it possible to assess which distribution fits better using a likelihood ratio? For example, to assess whether a model fits better under a negative binomial distribution than under a Poisson distribution, can you use the log likelihoods in the 'fit statistics' section of the output to perform a test such as the one explained at http://support.sas.com/kb/24/474.html?

In the example on that page, the two models differ in df because some variables are removed from one of them. But what should you specify when you want to compare the fit of two distributions? The df then stay almost the same (in several things I tried, the difference was 0.1 or less).

Thanks in advance for the help.



All Replies
SAS Super FREQ
Posts: 3,547

Re: comparing distributions likelihood ratio

This does not directly answer your question, but you might find it helpful to read the documentation for the SEVERITY procedure in SAS/ETS software. The SEVERITY procedure fits multiple models to data and provides statistics that you can assess to determine which model you want to use. It provides several likelihood statistics (-2LL, AIC, AICC, BIC) as well as ECDF statistics. It also provides graphical diagnostic plots to accompany the statistics.

Valued Guide
Valued Guide
Posts: 684

Re: comparing distributions likelihood ratio

You can conduct an LR test based on log-likelihoods if the two distributions are nested (i.e., if one is a special case of the other). For instance, the Poisson is a special case of the negative binomial (when 1/k = 0, the negative binomial reduces to the Poisson). In this example, the negative binomial has one more parameter than the Poisson (many sources use k as the overdispersion parameter of the negative binomial, but SAS uses scale = 1/k in several procedures). The df for the LR test is 1 because the two distributions differ by one parameter. The LR statistic is -2 times the difference in log-likelihoods (simpler model minus more complex model). Under the null hypothesis (H0: the distribution is the simpler one), the test statistic nominally has a chi-squared distribution. Caution: when the scale parameter must sit on the boundary of its range to give the simpler distribution, the test statistic may have a more complex distribution than a simple chi-squared with 1 df. Here the scale parameter ranges from 0 to infinity, and scale = 0 gives the simpler distribution, so the boundary case applies. Many ignore this issue.
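To make the arithmetic concrete, here is a minimal sketch in Python (standard library only) rather than SAS. The function name and the two -2LL values are hypothetical; in practice the values would be read from the fit statistics of the two runs. The optional `boundary` flag applies the usual adjustment for a single parameter tested on its boundary, where the asymptotic reference distribution is a 50:50 mixture of a point mass at 0 and a chi-squared with 1 df, so the p-value is halved.

```python
import math

def lr_test_1df(neg2ll_simple, neg2ll_complex, boundary=False):
    """Likelihood ratio test with 1 df from two -2 log-likelihood values."""
    # LR statistic: -2LL of the simpler model minus -2LL of the more complex one.
    # Clamp at 0 in case of tiny negative values from numerical noise.
    lr = max(neg2ll_simple - neg2ll_complex, 0.0)
    # Upper tail of a chi-squared with 1 df: P(X > lr) = erfc(sqrt(lr / 2)).
    p = math.erfc(math.sqrt(lr / 2.0))
    if boundary:
        # 50:50 mixture of a point mass at 0 and chi-squared(1): halve the p-value.
        p = p / 2.0 if lr > 0 else 1.0
    return lr, p

# Hypothetical -2LL values standing in for Poisson and negative binomial fits.
lr, p = lr_test_1df(neg2ll_simple=512.4, neg2ll_complex=498.1)
lr_b, p_b = lr_test_1df(512.4, 498.1, boundary=True)
print(f"LR = {lr:.2f}, p = {p:.5f}, boundary-adjusted p = {p_b:.5f}")
```

The nominal p-value is conservative relative to the boundary-adjusted one, which is one reason many people get away with ignoring the issue.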

Be careful with different procedures. Some programs may not give the actual log-likelihood. For instance, many log-likelihoods can be written as a sum of terms, where some terms involve parameters and data, and some terms involve only the data (not the parameters). To be computationally efficient, the terms not involving parameters may not be calculated or displayed. This is fine when one is comparing log-likelihoods for the same distribution (with the same procedure), but could cause trouble if you are comparing distributions.
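As an illustration of the dropped-constant issue, the sketch below (Python, standard library only) computes a Poisson log-likelihood with and without the data-only term -log(y!). The counts and the function name are made up for the example; it is not a description of what any particular SAS procedure does.

```python
import math

def poisson_loglik(y, mu, include_constant=True):
    """Poisson log-likelihood for counts y with a common mean mu."""
    # Kernel: the terms that involve the parameter mu.
    ll = sum(yi * math.log(mu) - mu for yi in y)
    if include_constant:
        # Data-only term -log(y!), which some software leaves out.
        ll -= sum(math.lgamma(yi + 1) for yi in y)
    return ll

y = [0, 2, 3, 1, 4]          # hypothetical counts
mu = sum(y) / len(y)         # maximum likelihood estimate of the mean
full = poisson_loglik(y, mu)
kernel = poisson_loglik(y, mu, include_constant=False)
constant = kernel - full     # equals sum(log(y!)), whatever the value of mu
print(full, kernel, constant)
```

The gap between the two values does not depend on mu, so it cancels when comparing Poisson fits with each other. It does not cancel against a log-likelihood for a different distribution that was computed in full.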

 

Be careful with estimation methods, too. If you use GLIMMIX (say, with different choices of distributions), make sure you are not using one of the pseudo-likelihood methods (RSPL, MSPL, ...). You need to be using the actual log-likelihood (method=quad).

Occasional Contributor DDK
Occasional Contributor
Posts: 16

Re: comparing distributions likelihood ratio

Just a quick question. I understand that there is an extra parameter with the negative binomial and that the test should therefore use df=1. Why is this not reflected in the Pearson Chi-square/DF statistic in the 'fit statistics for conditional distribution' section? That section reports both the Pearson Chi-square and the Pearson Chi-square/DF, so I should be able to calculate the df. Or is this referring to a different df?

Solution
‎09-25-2015 06:23 AM
Valued Guide
Valued Guide
Posts: 684

Re: comparing distributions likelihood ratio

I didn't notice in your original post, but it looks like you are using GLIMMIX.  The df used in the PearsonChiSq/df calculation does not involve the scale parameter. The 0.1 or so difference you noticed in the df calculation is just rounding. The proc (also GENMOD) uses the same df for Poisson and NB. The LR test to compare distributions has to be done by hand (or in a data step using ODS output), using df=1. Use -2LL from two runs of the procedure.

Occasional Contributor DDK
Occasional Contributor
Posts: 16

Re: comparing distributions likelihood ratio

Ah, thanks, that clarifies it. What if you have repeated measurements (R-side variance)? The SAS documentation states that this is not supported for method=quad. Is there a way around that in GLIMMIX?

Valued Guide
Valued Guide
Posts: 684

Re: comparing distributions likelihood ratio

You have to use a G-side covariance structure for the repeated measures with non-normal distributions when you use the quadrature or Laplace estimation methods. The book by Walt Stroup on GLMMs is excellent on this topic (with lots of SAS code available online).

Occasional Contributor DDK
Occasional Contributor
Posts: 16

Re: comparing distributions likelihood ratio

Thanks for all the help. It is clarified now. Will look at your book suggestion.

Respected Advisor
Posts: 2,655

Re: comparing distributions likelihood ratio

The idea of testing for a better fit for a distribution is intriguing, but sounds like a lot of work when comparison of information criteria ought to do the trick on its own. So long as the data, model and any random statements are the same, and the same link is used (and appropriate) for both distributions, AIC provides an excellent choice for distribution selection, in my experience.

Steve Denham
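For readers who want to see the AIC comparison spelled out, here is a small Python sketch using AIC = -2LL + 2p, where p is the number of estimated parameters. The -2LL values and parameter counts are hypothetical placeholders; the real ones come from the fit statistics of each run, and the procedures report AIC directly.

```python
def aic(neg2ll, n_params):
    """Akaike information criterion from -2 log-likelihood and parameter count."""
    return neg2ll + 2 * n_params

# Hypothetical fits: (-2LL, number of estimated parameters).
# The negative binomial carries one extra parameter (the scale).
fits = {"poisson": (512.4, 3), "negbin": (498.1, 4)}

aics = {name: aic(neg2ll, k) for name, (neg2ll, k) in fits.items()}
best = min(aics, key=aics.get)  # smaller AIC is preferred
print(aics, "->", best)
```

As with the LR test, the comparison is only meaningful if both -2LL values are full log-likelihoods computed on the same data.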
