LionelLeston
Calcite | Level 5

I am a post-doc working for my PhD advisor, and as part of my duties I am answering the statistics questions of my advisor's other students while my advisor is on sabbatical. One of the students is using generalized linear modeling (PROC GENMOD) to find the best distribution for modeling her data (normal, Poisson, or negative binomial) before running a mixed model with that distribution and the same predictors in PROC GLIMMIX. In the past, my advisor has suggested using AIC and/or the deviance/df of a model to decide, preferring the model with the lowest AIC and the deviance/df closest to 1.0. Recently, my advisor has asked us to use graphical diagnostics of residuals in PROC GENMOD, just as we would when testing whether data satisfy the statistical assumptions of ordinary least-squares regression in PROC REG.

The student knows how to graph residuals from generalized linear models in both PROC GENMOD and PROC UNIVARIATE after exporting model results to an output data set. Each time the student runs the same generalized linear model under a different distribution and creates a new data set, the residuals in each data set are different, although sometimes only slightly so. But when the residuals in each data set are graphed, the graphs (e.g. histograms, Q-Q plots) are identical for each set of residuals, although each set of residuals has different basic statistics. If the basic statistics do not differ that much, how likely is it that diagnostic graphs would look identical?
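For readers without the attachments, the workflow described above might look roughly like the following minimal sketch. The data set `mydata`, the response `count`, and the predictors `x1` and `x2` are hypothetical placeholders, not the names used in the attached program.

```sas
/* Hypothetical names: mydata, count, x1, x2 are placeholders only. */
/* Fit the same predictors under each candidate distribution and   */
/* write the residuals to a separately named output data set.      */
proc genmod data=mydata;
  model count = x1 x2 / dist=normal link=identity;
  output out=res_normal pred=mu resraw=r_raw;
run;

proc genmod data=mydata;
  model count = x1 x2 / dist=poisson link=log;
  output out=res_poisson pred=mu resraw=r_raw;
run;

proc genmod data=mydata;
  model count = x1 x2 / dist=negbin link=log;
  output out=res_negbin pred=mu resraw=r_raw;
run;

/* Graph one set of residuals; repeat with data=res_normal */
/* and data=res_negbin to compare the three fits.          */
proc univariate data=res_poisson;
  var r_raw;
  histogram r_raw / normal;
  qqplot r_raw / normal(mu=est sigma=est);
run;
```

When the graphs come out identical, it may be worth confirming that each PROC UNIVARIATE step names the intended output data set and residual variable.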

Furthermore, when the student tries this in PROC GLIMMIX instead of PROC GENMOD (i.e., runs the same mixed model three times, each time under a different distribution) and then runs graphical diagnostics on the residuals from each model, she does get different Q-Q plots for each distribution. I suppose the difference could be that in the mixed models the student is accounting for random effects due to repeated measurements from the same sites, but why do we see identical graphs when we don't include random effects in PROC GENMOD?
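The mixed-model version might be sketched as follows, again with hypothetical names (`mydata`, `count`, `x1`, `x2`, and a `site` identifier for the repeated measurements). The random-intercept structure shown here is an assumption for illustration; the attached program may specify the random effects differently.

```sas
/* Hypothetical names: mydata, count, x1, x2, site are placeholders. */
/* A random intercept per site is assumed purely for illustration.  */
proc glimmix data=mydata;
  class site;
  model count = x1 x2 / dist=poisson link=log solution;
  random intercept / subject=site;
  /* Save predictions and residuals on the data (inverse-link) scale. */
  output out=gmx_poisson pred(ilink)=mu resid(ilink)=r_raw
         pearson(ilink)=r_pearson;
run;
```

Changing `dist=` (and the link, where appropriate) and the `out=` name for each run would give one residual data set per distribution, which can then be graphed in PROC UNIVARIATE.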

I have included the data and program as attachments.

2 REPLIES 2
SteveDenham
Jade | Level 19

The only thing I can offer here is to look at the plots of the Pearson residuals, rather than the raw residuals.  Scaling by the proper standard error for the different distributions may make the difference more apparent.
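As a rough sketch of that suggestion: the Pearson residual divides the raw residual y - mu by the square root of the variance function V(mu), which is 1 for the normal, mu for the Poisson, and mu + k*mu^2 for the negative binomial (k being the dispersion parameter). In PROC GENMOD it can be requested with `RESCHI=` on the OUTPUT statement. The data set and variable names below are hypothetical placeholders.

```sas
/* Hypothetical names: mydata, count, x1, x2 are placeholders only. */
proc genmod data=mydata;
  model count = x1 x2 / dist=poisson link=log;
  /* resraw = raw residual, reschi = Pearson residual,   */
  /* stdreschi = standardized Pearson residual.          */
  output out=res_poisson pred=mu resraw=r_raw
         reschi=r_pearson stdreschi=r_stdpearson;
run;

/* Q-Q plot of the Pearson residuals instead of the raw ones. */
proc univariate data=res_poisson;
  var r_pearson;
  qqplot r_pearson / normal(mu=est sigma=est);
run;
```

Repeating this with `dist=normal` and `dist=negbin` (and a different `out=` name each time) gives Pearson residuals scaled by each distribution's own variance function.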

(No guarantees)

Steve Denham


