Statistical Procedures

RyanD · Posted 03-29-2012 03:31 PM

I've been trying to follow the example at http://www.ats.ucla.edu/stat/sas/faq/relative_risk.htm to estimate relative risk by Poisson regression with robust error variances, but I don't get the same criteria for assessing goodness of fit shown at the link. I get only GEE Fit Criteria, with QIC and QICu listed. I had been using the Pearson chi-square value/df to decide whether I was using the right model. For example, I was told that if value/df is 1.25 or greater, then I should use a negative binomial model (dist=negbin) rather than Poisson. However, because I can't get the Pearson chi-square, I don't know how to assess goodness of fit.

1. Can I get Pearson chi-square stats using Poisson regression as outlined in the link? If so, how?

2. Am I using the correct logic in deciding which procedure to use? The link also mentions log binomial, but under what conditions should it be used? When I tried log binomial, I didn't get criteria for assessing goodness of fit that I knew how to interpret.

Thanks,

Ryan

Here's my SAS code:

proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 id;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 / dist = poisson link = log;
  repeated subject = id / type = unstr;
  estimate 'NonTTS vs. BF' hosplevel 1 0 -1 / exp;
  estimate 'TTS vs. BF' hosplevel 0 1 -1 / exp;
  estimate 'NoCollege' NoCollege 1 -1 / exp;
  estimate 'cesarean' cesarean 1 -1 / exp;
  estimate 'PreTerm' PreTerm 1 -1 / exp;
  estimate 'LBW' LBW 1 -1 / exp;
  estimate 'NICU' NICU 1 -1 / exp;
  estimate 'UnMarried' UnMarried 1 -1 / exp;
  estimate 'teen' teen 1 -1 / exp;
  estimate 'Over40' Over40 1 -1 / exp;
run;

lvm · Posted 03-29-2012 08:09 PM

The User's Guide for GENMOD says that you do not get the Pearson chi-square and df ratio when you use a REPEATED statement. When you use a REPEATED statement, you are essentially rescaling your data so that the variability is comparable to that found for a Poisson (or whatever distribution is specified). The Pearson chi-square/df would have no meaning.
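For anyone who wants to see the statistic anyway, here is a minimal sketch of the original poster's model refit without the REPEATED statement. It assumes the same data set and variable names as the posted code; the id variable is dropped from the CLASS statement because it is only needed for REPEATED. Treat this fit as a diagnostic only: its standard errors are model-based, not robust.

```sas
/* Sketch only: same mean model as the posted code, but with no REPEATED
   statement, so GENMOD prints its goodness-of-fit criteria, including the
   Pearson chi-square and its Value/DF ratio.                             */
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU
                       UnMarried teen Over40 / dist = poisson link = log;
run;
```

Keep in mind that for a 0/1 outcome the variance p(1-p) is smaller than the Poisson variance p, so Value/DF will usually come out below 1. That is underdispersion relative to the Poisson, not a signal to switch to a negative binomial.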


PGStats · Posted 03-29-2012 04:59 PM

Guessing the meaning of your variables from their names, the following question springs to mind: why not DIST=BINOMIAL LINK=LOGIT?

PG

RyanD · Posted 04-02-2012 08:24 AM

I'm looking for a risk ratio rather than an odds ratio. It's my understanding that link=logit is equivalent to proc logistic, which is what I don't want.
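For comparison, a log-binomial model also gives risk ratios directly, because it keeps the binomial distribution but swaps the logit link for a log link. The sketch below is an illustration, not a recommendation. It assumes NoBreastfeed is coded 0/1 and reuses the data set and variable names from the posted code; the ESTIMATE statement is copied from that code as well.

```sas
/* Sketch only: log-binomial model. With LINK = LOG, an exponentiated
   coefficient is a risk ratio rather than an odds ratio.
   Assumption: NoBreastfeed is coded 0/1, so DESCENDING models the
   probability that NoBreastfeed = 1.                                 */
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU
                       UnMarried teen Over40 / dist = binomial link = log;
  estimate 'NoCollege' NoCollege 1 -1 / exp;
run;
```

Log-binomial models often fail to converge, because nothing keeps the fitted probabilities below 1. That is the usual reason for falling back on Poisson regression with robust error variances.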

RyanD · Posted 04-02-2012 09:15 AM

Thank you. You're right. I tried it without the REPEATED statement and got the Pearson chi-square and df ratio. The instructions at the link above use the REPEATED statement to get robust error variances (Zou G. A Modified Poisson Regression Approach to Prospective Studies with Binary Data. Am J Epidemiol 2004; 159(7):702-6). It's interesting that the output shown in the example, using the same coding, contains the Pearson chi-square and df ratio.

Can I interpret your words, "rescaling your data so that the variability is comparable to that found for a Poisson", to mean that this approach makes my data fit the Poisson, so that no goodness-of-fit test is necessary?

Thanks again. I appreciate your help.

proctice · Posted 08-16-2016 08:13 PM

I hope it is ok to reply to an old thread, but I had the same question about my fit statistics disappearing when I add a repeated statement. Is overdispersion not an issue in GEE? How would I choose between Poisson, Negative Binomial, Zero-Inflated, etc. in a clustered data context? Thank you.

SteveDenham · Posted 08-24-2016 07:20 AM

Going out on a limb here, but if you fit the repeated nature as a G-side matrix in PROC GLIMMIX, and use method=laplace or method=quad, you will get likelihood-based information criteria, which could be used to rank the distributions, provided there is no difference in the fixed effects part of the model AND an identical link function is used. This would work for comparing Poisson to negative binomial. Unfortunately, the zero-inflated models aren't easily fit in GLIMMIX, and this probably isn't a good method to compare them in any case, as they are truly mixture models, so the resulting likelihood won't reflect the same "data".
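A rough sketch of that comparison might look like the following. The data set mydata and the variables count_y, x1, and cluster_id are hypothetical placeholders, and a random intercept per cluster stands in for whatever G-side structure actually suits the data.

```sas
/* Sketch only. Hypothetical names: mydata, count_y, x1, cluster_id.
   Both fits use the same fixed effects and the same log link, so their
   information criteria can be compared side by side.                  */
proc glimmix data = mydata method = laplace;
  class cluster_id;
  model count_y = x1 / dist = poisson link = log solution;
  random intercept / subject = cluster_id;   /* G-side random intercept */
run;

proc glimmix data = mydata method = laplace;
  class cluster_id;
  model count_y = x1 / dist = negbin link = log solution;
  random intercept / subject = cluster_id;
run;
```

Compare the AIC, AICC, and BIC values in the two Fit Statistics tables; smaller is better. With METHOD=LAPLACE, GLIMMIX also reports a Pearson chi-square/DF for the conditional distribution, which gives a rough check on leftover overdispersion in the Poisson fit.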

Something more general using PROC NLMIXED could probably be done, but my brain is turning to mush and I don't have a link to that right at hand.
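One possible shape for the NLMIXED approach is a zero-inflated Poisson with a cluster-level random intercept, written out as a general log likelihood. This is a sketch under assumptions, not a tested recipe. The names mydata, count_y, x1, and cluster_id are hypothetical, the zero-inflation part is intercept-only, and the starting values are arbitrary.

```sas
/* Sketch only. Hypothetical names: mydata, count_y, x1, cluster_id.
   NLMIXED starts a new subject whenever the SUBJECT= value changes,
   so the data are sorted by cluster first.                          */
proc sort data = mydata;
  by cluster_id;
run;

proc nlmixed data = mydata;
  parms b0 = 0 b1 = 0 a0 = 0 s2u = 1;   /* arbitrary starting values */
  bounds s2u >= 0;                      /* variance cannot be negative */
  /* Poisson mean with a cluster-level random intercept u */
  lambda = exp(b0 + b1*x1 + u);
  /* Probability of a structural zero (intercept-only logit) */
  p0 = 1 / (1 + exp(-a0));
  /* Zero-inflated Poisson log likelihood for one observation */
  if count_y = 0 then ll = log(p0 + (1 - p0)*exp(-lambda));
  else ll = log(1 - p0) + count_y*log(lambda) - lambda - lgamma(count_y + 1);
  model count_y ~ general(ll);
  random u ~ normal(0, s2u) subject = cluster_id;
run;
```

NLMIXED reports -2 log likelihood, AIC, AICC, and BIC for this fit. Whether those are comparable with criteria from a non-mixture model fit in another procedure is exactly the caveat raised above.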

Steve Denham
