Interpreting fit statistics in proc glimmix with a binary outcome and ...

Posted 11-15-2023 05:09 AM
(619 views)

Hi everyone, I am a non-statistician looking for some advice on how to interpret the fit statistics in proc glimmix.

I am computing odds ratios for an event (0/1) over time in the same individuals.

Data looks something like this:

| ID | period | outcome | season | prev_treated |
|----|--------|---------|--------|--------------|
| 1 | 1 | 0 | 3 | 0 |
| 1 | 2 | 0 | 4 | 0 |
| 1 | 3 | 1 | 1 | 0 |
| 1 | 4 | 1 | 2 | 0 |
| 2 | 1 | 0 | 2 | 1 |
| 2 | 2 | 1 | 3 | 1 |

It might be worth adding that the share of observations in which outcome=1 is small, approx 10-20%.

The current model looks like this:

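```
proc glimmix data=data1 method=rspl plots=oddsratio;
  /* period 1 is the reference level */
  class ID period(ref="1") season prev_treated;
  model outcome(event="1") = period season prev_treated
        / dist=binary link=logit oddsratio s;
  /* G-side: random intercept for each individual */
  random intercept / subject=id;
  /* R-side: AR(1) residual correlation across periods within an individual */
  random period / subject=id residual type=AR(1);
run;
```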

I have two questions:

1) Is there any way, based on this information, to determine which method should be used (RSPL, RMPL, MSPL, MMPL)?

2) In relation to 1), how do I interpret the Fit statistics table:

- -2 res log pseudolikelihood
- Generalized Chi-Square
- Gener. Chi-Square/DF

In other words, can these be used like AIC, where lower is better, for example when comparing different methods in the method= option or different covariance structures in the type= option? (AR(1), ARMA(1,1) and TOEP are of interest.)

Also feel free to comment on the model, if you have other suggestions.

Thanks

6 REPLIES

Hello,

- -2 Res Log Pseudo-Likelihood : The likelihood is preceded by the word “Pseudo” to indicate that it is computed from a pseudo-likelihood, rather than the true likelihood.
- Gener. Chi-Square / DF : The ratio of the generalized chi-square statistic and its degrees of freedom should be close to 1. This would indicate that the variability in your data has been properly modeled, and that there is no residual overdispersion.
- Generalized Chi-Square : The generalized chi-square statistic is a quadratic form in the marginal residuals that takes correlations among the data into account.

BR, Koen

Just a quick comment on covariance structure selection. If you use any of the pseudo-likelihood methods, the information criteria probably should not be used for selection, as the pseudo-likelihood estimates aren't the same under various structures. Thus @StatsMan's comments re LAPLACE or QUADRATURE. If you truly want to use pseudo-likelihood methods, then probably the best you can do for covariance structure selection is to look at the Gener. Chi-Square / DF value and pick the structure that has the least over- or under-dispersion, that is, the value closest to 1. You should note that this measure will get closer to 1 the more parameters are estimated, and there is no penalization for this as there is for the information criteria, so "Caveat emptor" - let the user (buyer) beware.
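For anyone who wants to try this comparison, here is a minimal sketch rather than a tested recipe. It refits the model from the original post under two candidate R-side structures and saves each Fit Statistics table with ODS OUTPUT. The output data set names (fit_ar1, fit_toep, fit_compare) are made up for illustration, and the Descr and Value column names are what the FitStatistics table normally carries, so it is worth confirming them with proc contents.

```sas
/* AR(1) R-side structure: save the Fit Statistics table */
proc glimmix data=data1 method=rspl;
  class ID period(ref="1") season prev_treated;
  model outcome(event="1") = period season prev_treated
        / dist=binary link=logit;
  random intercept / subject=ID;
  random period / subject=ID residual type=ar(1);
  ods output FitStatistics=fit_ar1;
run;

/* Same model, Toeplitz R-side structure */
proc glimmix data=data1 method=rspl;
  class ID period(ref="1") season prev_treated;
  model outcome(event="1") = period season prev_treated
        / dist=binary link=logit;
  random intercept / subject=ID;
  random period / subject=ID residual type=toep;
  ods output FitStatistics=fit_toep;
run;

/* Stack the two tables and label each row with its structure */
data fit_compare;
  length structure $10;
  set fit_ar1(in=in_ar1) fit_toep;
  if in_ar1 then structure = "AR(1)";
  else structure = "TOEP";
run;

proc print data=fit_compare noobs;
  var structure Descr Value;
run;
```

The row to compare is Gener. Chi-Square / DF; ARMA(1,1) can be added the same way with type=arma(1,1). The caveat above still applies: the ratio is not penalized for the number of covariance parameters.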

SteveDenham

Unfortunately there are no good ways that I am aware of to do what you asked for with your model.

Thanks,

Jill

Would a two-step method be a possibility? Step 1: Use LAPLACE or QUAD (if you have enough data) to fit the RANDOM effects, and output the variance/covariance parameter estimates to a dataset. This would enable selection of an error structure with the smallest corrected AIC. Step 2: Fit your current model using the pseudo-likelihood method and a residual (R-side) effect for the repeated factor. You could use the values obtained in the first step as starting values in a PARMS statement.

**NOTE WELL: THIS IS UNTESTED AND THERE IS NO GUARANTEE THAT IT WILL SOLVE THE PROBLEM**

Additionally, you should consider that since this is a GLMM with a binary distribution, the best approach may be to do this all as a G-side analysis.
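A rough sketch of what a G-side fit with LAPLACE might look like, under the same caveat that it is untested on this data. It reuses the data set and variables from the original post; the output data set names (fit_laplace_ar1, cp_laplace_ar1) are made up. Whether to keep a separate random intercept alongside the G-side AR(1) effect is a modeling decision this sketch leaves open; it is dropped here only to keep the example small.

```sas
/* G-side only, Laplace approximation: the Fit Statistics table then
   reports -2 Log Likelihood, AIC, AICC, BIC and related criteria */
proc glimmix data=data1 method=laplace;
  class ID period(ref="1") season prev_treated;
  model outcome(event="1") = period season prev_treated
        / dist=binary link=logit oddsratio s;
  /* No RESIDUAL option: R-side effects are not available with
     METHOD=LAPLACE, so the AR(1) structure sits on the G side */
  random period / subject=ID type=ar(1);
  /* Save fit statistics and covariance parameter estimates */
  ods output FitStatistics=fit_laplace_ar1 CovParms=cp_laplace_ar1;
run;
```

Refitting with type=arma(1,1) or type=toep and comparing the AICC rows gives the selection described in step 1. If the saved estimates are then used as starting values in a PARMS statement for step 2, check first that the covariance parameters appear in the same order in both models, since G-side and R-side fits may list them differently.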

SteveDenham
