"Fit statistics based on pseudo-likelihoods are not useful for comparing models that differ in their pseudo-data"

Respected Advisor
Posts: 4,754

That note is from PROC GLIMMIX. Is there a systematic way to compare the "pseudo-data" from two models?

PG

Respected Advisor
Posts: 2,655

Re: "Fit statistics based on pseudo-likelihoods are not useful for comparing models that differ in their pseudo-data"

There is no agreed-upon method yet, which makes comparing R-side models with different covariance structures a problem whenever the mean and variance are functionally related (i.e., for every available distribution except the normal and lognormal).

As for actually comparing the pseudo-data, the closest I can imagine is comparing the residuals from the competing models. Those give the difference between the "converged pseudo-data" and the original data, so different covariance structures would yield different residuals. From there one could perhaps calculate a PRESS value for each model and compare those. Until someone (probably someone active on the lmer/glmer mailing lists in the R community) gets around to it, I think the most logical thing to do is to avoid the pseudo-data if at all possible and use numerical integration methods instead.
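To make the numerical integration idea concrete, here is a minimal sketch in Python (not SAS, and not how PROC GLIMMIX implements it) of a marginal likelihood computed by Gauss-Hermite quadrature for a Poisson model with one random intercept per cluster. The data, the parameter values, and the function names are all made up for illustration, and the fixed 3-point rule is far too crude for real work. The point is only that the quantity being maximized is a likelihood of the original counts, so fit statistics built on it are comparable across models. In PROC GLIMMIX the corresponding options are METHOD=QUAD and METHOD=LAPLACE; as far as I know those do not accept R-side covariance structures, so this route means recasting the model with G-side random effects.

```python
import math

# 3-point Gauss-Hermite rule for integrals of the form: integral exp(-x^2) f(x) dx
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [math.sqrt(math.pi) / 6, 2 * math.sqrt(math.pi) / 3, math.sqrt(math.pi) / 6]

def poisson_pmf(y, mu):
    """Poisson probability of count y given mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def cluster_likelihood(y, beta0, sigma):
    """Marginal likelihood of one cluster's counts, integrating out b ~ N(0, sigma^2)."""
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        b = math.sqrt(2.0) * sigma * x      # change of variable to the N(0, sigma^2) scale
        mu = math.exp(beta0 + b)            # log link: conditional mean given b
        cond = 1.0
        for yi in y:
            cond *= poisson_pmf(yi, mu)     # conditional likelihood of the raw counts
        total += w * cond
    return total / math.sqrt(math.pi)

def log_likelihood(clusters, beta0, sigma):
    """Sum of log marginal likelihoods over independent clusters."""
    return sum(math.log(cluster_likelihood(y, beta0, sigma)) for y in clusters)

# Hypothetical counts for three clusters (illustration only)
clusters = [[0, 1, 2], [4, 6, 5], [1, 0, 0]]
ll_small = log_likelihood(clusters, beta0=0.5, sigma=0.1)
ll_large = log_likelihood(clusters, beta0=0.5, sigma=0.8)
print(ll_small, ll_large)
```

Both log-likelihoods refer to the same observed counts, which is exactly what the pseudo-likelihood fit statistics cannot offer.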

Steve Denham

Respected Advisor
Posts: 4,754

Re: "Fit statistics based on pseudo-likelihoods are not useful for comparing models that differ in their pseudo-data"

Thanks, Steve. Your explanation is way above my head! Can you suggest a reference to help me understand the concept of pseudo-data? - PG

PG
Respected Advisor
Posts: 2,655

Re: "Fit statistics based on pseudo-likelihoods are not useful for comparing models that differ in their pseudo-data"

Start with the first edition of SAS for Mixed Models by Littell et al., which covers the precursor to PROC GLIMMIX, the %GLIMMIX macro. It is in Chapter 11, and all of the code is in the back somewhere.

The pseudo-data are best defined in Wolfinger and O'Connell (1993), "Generalized Linear Mixed Models: A Pseudo-Likelihood Approach", Journal of Statistical Computation and Simulation, 48:233-243.

There is also a very good, matrix-intensive presentation in Section 4.5.1 of Walt Stroup's Generalized Linear Mixed Models.
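For orientation before digging into those references, the linearization can be sketched as follows. This is a summary under the usual notation, not a substitute for the sources: X and Z are the fixed- and random-effects design matrices, g is the link function, tildes mark the current estimates, A is the diagonal matrix of variance-function values, and R is the R-side covariance matrix.

```latex
% Conditional mean of the response given the random effects
\[
\mathrm{E}[\mathbf{Y} \mid \boldsymbol{\gamma}]
  = g^{-1}(\mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma})
  = g^{-1}(\boldsymbol{\eta}) = \boldsymbol{\mu}
\]

% First-order Taylor expansion about the current estimates
\[
g^{-1}(\boldsymbol{\eta}) \approx g^{-1}(\tilde{\boldsymbol{\eta}})
  + \tilde{\boldsymbol{\Delta}}\mathbf{X}(\boldsymbol{\beta} - \tilde{\boldsymbol{\beta}})
  + \tilde{\boldsymbol{\Delta}}\mathbf{Z}(\boldsymbol{\gamma} - \tilde{\boldsymbol{\gamma}}),
\qquad
\tilde{\boldsymbol{\Delta}} =
  \left.\frac{\partial g^{-1}(\boldsymbol{\eta})}{\partial \boldsymbol{\eta}}
  \right|_{\tilde{\boldsymbol{\beta}},\,\tilde{\boldsymbol{\gamma}}}
\]

% Pseudo-data (pseudo-response)
\[
\mathbf{P} = \tilde{\boldsymbol{\Delta}}^{-1}
  \bigl(\mathbf{Y} - g^{-1}(\tilde{\boldsymbol{\eta}})\bigr)
  + \mathbf{X}\tilde{\boldsymbol{\beta}} + \mathbf{Z}\tilde{\boldsymbol{\gamma}}
\]

% Working linear mixed model fitted to the pseudo-data
\[
\mathbf{P} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\epsilon},
\qquad
\mathrm{Var}[\boldsymbol{\epsilon}]
  = \tilde{\boldsymbol{\Delta}}^{-1}\mathbf{A}^{1/2}\mathbf{R}\,\mathbf{A}^{1/2}
    \tilde{\boldsymbol{\Delta}}^{-1}
\]
```

Because P is rebuilt from the current estimates at every iteration, two models that converge to different estimates end up fitting a linear mixed model to two different response vectors.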

The pseudo-data are the linearized responses obtained after each step of the optimization, and so they depend on the covariance structure imposed in the process. As a result, IC values, and even likelihood ratio tests, are somewhat questionable whenever the expected value and the variance are functionally related and non-separable, which is the case for every distribution except the Gaussian and lognormal.
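A small Python sketch may make that dependence visible. It assumes a Poisson response with a log link, where the pseudo-response for one observation is eta + (y - mu) / (dmu/deta) with mu = exp(eta). The counts and the two sets of linear predictors are invented; eta_a and eta_b merely stand in for what two competing covariance structures might converge to, and nothing here is output from PROC GLIMMIX.

```python
import math

def pseudo_data(y, eta):
    """Pseudo-response for a log link: p_i = eta_i + (y_i - mu_i) / (dmu/deta)_i."""
    out = []
    for yi, ei in zip(y, eta):
        mu = math.exp(ei)       # inverse link
        dmu_deta = mu           # derivative of exp(eta) with respect to eta
        out.append(ei + (yi - mu) / dmu_deta)
    return out

# Same observed counts for both hypothetical models
y = [0, 2, 5, 9]

# Hypothetical converged linear predictors from two competing models
eta_a = [math.log(1.0), math.log(2.0), math.log(4.0), math.log(8.0)]
eta_b = [math.log(4.0)] * 4

p_a = pseudo_data(y, eta_a)
p_b = pseudo_data(y, eta_b)

for yi, a, b in zip(y, p_a, p_b):
    print(yi, round(a, 3), round(b, 3))
```

The two models share the same y but not the same pseudo-data, so a pseudo-likelihood from one is a likelihood for a different "response" than the other, which is the substance of the GLIMMIX note.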

Steve Denham
