Programming the statistical procedures from SAS

Calibration using GLIMMIX

I am using PROC GLIMMIX to develop a model for mortality by a specified time point (a binary outcome: alive or dead at 30 days). The data are clustered by hospital, so I am fitting a random-intercept model with the reporting hospital as the subject variable.

In order to properly evaluate the performance of my models I would like to examine their calibration and discrimination. I found this article (41364 - ROC analysis for binary response models fit in the GLIMMIX, NLMIXED, GAM or other procedures) detailing how to create an ROC curve and get a c-statistic (i.e. area under the ROC curve) for examining model discrimination; however, I am still having problems figuring out how to get a good measure of calibration.

PROC GLIMMIX does not have the LACKFIT option to produce a Hosmer-Lemeshow statistic as PROC LOGISTIC does (and I am fairly certain that this statistic is not appropriate for clustered data anyway). I would like to produce something along the lines of a "plot of expected vs. observed mortality rates across deciles of increasing risk" but am having trouble figuring out how to go about it.

Any help you could provide would be greatly appreciated.

Thank you!
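
The decile plot described in the question could be sketched roughly as follows. This is only a sketch under assumptions: the dataset name `mydata`, the outcome `dead30` (coded 1 = dead at 30 days), the numeric covariates `age` and `severity`, and the cluster variable `hospital` are all hypothetical placeholders, and the choice of estimation method is illustrative rather than a recommendation.

```sas
/* Hypothetical names: mydata, dead30 (1 = dead), age, severity, hospital */
proc glimmix data=mydata method=laplace;
  class hospital;
  model dead30(event='1') = age severity / dist=binary link=logit solution;
  random intercept / subject=hospital;
  /* p_cond uses the estimated hospital effects (BLUPs);       */
  /* p_noblup sets the random intercept to zero for comparison */
  output out=pred pred(ilink)=p_cond pred(noblup ilink)=p_noblup;
run;

/* Assign each patient to a decile of predicted risk (0 = lowest) */
proc rank data=pred groups=10 out=ranked;
  var p_cond;
  ranks decile;
run;

/* Observed and expected mortality rate within each decile */
proc means data=ranked noprint nway;
  class decile;
  var dead30 p_cond;
  output out=calib mean(dead30)=observed mean(p_cond)=expected n(dead30)=n;
run;

/* Points near the 45-degree line suggest good calibration */
proc sgplot data=calib;
  scatter x=expected y=observed;
  lineparm x=0 y=0 slope=1;
  xaxis label="Expected mortality rate";
  yaxis label="Observed mortality rate";
run;
```

Swapping `p_cond` for `p_noblup` in the PROC RANK and PROC MEANS steps gives the same plot for predictions that do not use the hospital-specific effects, which may be the more relevant check if the model will be applied to hospitals outside the sample.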

Re: Calibration using GLIMMIX

Posted in reply to rhysticlight

The hard part is interpreting anything like a Hosmer-Lemeshow (HL) statistic in light of the clustering. I would suggest doing a within-hospital lack-of-fit test for each hospital, as well as one overall test that essentially ignores the clustering. If the latter shows a lack of fit, the within-hospital tests might then quickly identify it as being due to a specific hospital. I think all of these tests would have to be obtained from PROC LOGISTIC, first with a BY hospital statement, and then without.
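
A minimal sketch of those two sets of tests, assuming a hypothetical dataset `mydata` with outcome `dead30` (1 = dead), numeric covariates `age` and `severity`, and cluster variable `hospital`:

```sas
/* BY-group processing requires the data to be sorted by hospital */
proc sort data=mydata;
  by hospital;
run;

/* Within-hospital HL tests; the results are captured in hl_by_hosp */
ods output LackFitChiSq=hl_by_hosp;
proc logistic data=mydata;
  by hospital;
  model dead30(event='1') = age severity / lackfit;
run;

/* Overall HL test that ignores the clustering */
ods output LackFitChiSq=hl_overall;
proc logistic data=mydata;
  model dead30(event='1') = age severity / lackfit;
run;
```

Hospitals with few patients or few deaths may not support a stable within-hospital fit, so some BY groups may produce warnings or no usable HL statistic.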

You might be able to take the within-hospital p-values as data for a generalized linear model with a beta distribution, and use the sample sizes as weights. This might provide a better pooled representation than the "ignore the clustering" approach.
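
One way this idea might be set up is sketched below. It assumes a hypothetical dataset `hl_p` with one row per hospital holding the within-hospital lack-of-fit p-value (`p_value`) and the hospital sample size (`n`); those names, and the intercept-only form of the model, are illustrative assumptions rather than a worked-out method.

```sas
/* Hypothetical input: hl_p with variables hospital, p_value, n */
data hl_beta;
  set hl_p;
  /* The beta distribution is defined only on the open interval (0,1) */
  if 0 < p_value < 1;
run;

/* Intercept-only beta regression, weighted by hospital sample size */
proc glimmix data=hl_beta;
  model p_value = / dist=beta link=logit solution;
  weight n;
run;
```

The inverse logit of the estimated intercept can then be read as a weighted, pooled p-value across hospitals.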

Good luck.

Steve Denham
