Poisson regression goodness of fit stats

03-29-2012 03:31 PM

I've been trying to follow the example at http://www.ats.ucla.edu/stat/sas/faq/relative_risk.htm to estimate relative risk by Poisson regression with robust error variances, but I don't get the goodness-of-fit criteria shown at the link. I get only the GEE Fit Criteria, with QIC and QICu listed. I had been using the Pearson chi-square value/df to decide whether I was using the right model. For example, I was told that if value/df is 1.25 or greater, then I should use a negative binomial model (dist=negbin) rather than Poisson. However, because I can't get the Pearson chi-square, I don't know how to assess goodness of fit.

1. Can I get Pearson chi-square statistics using Poisson regression as outlined in the link? If so, how?

2. Am I using the correct logic in choosing which model to use? The link also mentions log binomial, but under what conditions should it be used? When I tried log binomial, I didn't get goodness-of-fit criteria that I knew how to interpret.

Thanks,

Ryan

Here's my SAS code:

proc genmod data = nbscrBirthVars_recode descending;

class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 id;

model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 / dist = poisson link = log;

repeated subject = id / type = unstr;

estimate 'NonTTS vs. BF' hosplevel 1 0 -1 / exp;

estimate 'TTS vs. BF' hosplevel 0 1 -1 / exp;

estimate 'NoCollege' NoCollege 1 -1 / exp;

estimate 'cesarean' cesarean 1 -1 / exp;

estimate 'PreTerm' PreTerm 1 -1 / exp;

estimate 'LBW' LBW 1 -1 / exp;

estimate 'NICU' NICU 1 -1 / exp;

estimate 'UnMarried' UnMarried 1 -1 / exp;

estimate 'teen' teen 1 -1 / exp;

estimate 'Over40' Over40 1 -1 / exp;

run;

Accepted Solutions

Solution

03-29-2012 08:09 PM

Posted in reply to RyanD


All Replies

Posted in reply to RyanD

03-29-2012 04:59 PM

Guessing the meaning of your variables from their names, the following question springs to mind: Why not DIST=BINOMIAL LINK=LOGIT?

PG

Posted in reply to PGStats

04-02-2012 08:24 AM

I'm looking for a risk ratio rather than an odds ratio. It's my understanding that link=logit is equivalent to proc logistic, which is what I don't want.
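For comparison, the log-binomial alternative mentioned in the original question keeps the binomial distribution but swaps the logit link for a log link, so exponentiated estimates are risk ratios rather than odds ratios. The following is only a minimal sketch, reusing the data set and variable names from the code posted above and cutting the model down to a single predictor for brevity. Log-binomial fits can fail to converge, which is one reason the modified Poisson approach is used instead.

```sas
/* Sketch only: log-binomial model for a risk ratio.
   Names are taken from the code posted earlier in the thread;
   the model is reduced to one predictor purely for illustration. */
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel;
  /* link = log (not logit), so exp(estimate) is a risk ratio */
  model NoBreastfeed = hosplevel / dist = binomial link = log;
  estimate 'NonTTS vs. BF' hosplevel 1 0 -1 / exp;
run;
```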

04-02-2012 09:15 AM

Thank you. You're right. I tried it without the repeated statement and got the Pearson chi-square and its value/df ratio. The instructions at the link above use the repeated statement to get robust error variances (Zou G. A Modified Poisson Regression Approach to Prospective Studies with Binary Data. Am J Epidemiol 2004; 159(7):702-6.). It's interesting that the output shown in the example, which uses the same code, contains the Pearson chi-square and value/df ratio.

Can I interpret your words, "rescaling your data so that the variability is comparable to that found for a Poisson", to mean that this approach makes my data fit the Poisson distribution, so that no goodness-of-fit test is necessary?

Thanks again. I appreciate your help.
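The check described in this reply, refitting the same model without the repeated statement so that the Pearson chi-square and value/df are printed again, can be sketched as follows. It reuses the data set and variable names from the original post. The id variable is left out of the class statement on the assumption that it was only needed as the repeated subject.

```sas
/* Sketch only: same Poisson model as in the original post, minus REPEATED.
   Without REPEATED the output includes the Pearson chi-square and Value/DF
   (as reported in this thread), but the standard errors are model-based,
   not robust. */
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU
                       UnMarried teen Over40 / dist = poisson link = log;
run;
```

One caveat when reading value/df here: a 0/1 outcome has variance p(1-p), which is smaller than its mean p, so a Poisson working model will tend to look underdispersed rather than overdispersed. The robust variance from the repeated statement is what corrects the standard errors in the modified Poisson approach.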

Posted in reply to RyanD

08-16-2016 08:13 PM

I hope it is ok to reply to an old thread, but I had the same question about my fit statistics disappearing when I add a repeated statement. Is overdispersion not an issue in GEE? How would I choose between Poisson, Negative Binomial, Zero-Inflated, etc. in a clustered data context? Thank you.

Posted in reply to proctice

08-24-2016 07:20 AM

Going out on a limb here, but if you fit the repeated nature as a G-side matrix in PROC GLIMMIX, and use method=laplace or method=quad, you will get quasi-likelihood information criteria, which could be used to rank the distributions, provided there is no difference in the fixed-effects part of the model AND an identical link function is used. This would work for comparing Poisson to negative binomial. Unfortunately, zero-inflated models aren't easily fit in GLIMMIX, and this probably isn't a good way to compare them in any case: they are truly mixture models, so the resulting quasi-likelihood won't reflect the same "data".

Something more general using PROC NLMIXED could probably be done, but my brain is turning to mush and I don't have a link to that right at hand.

Steve Denham
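The GLIMMIX comparison suggested above might look something like the sketch below. The data set clustered, the count response y, the categorical predictor x, and the cluster identifier cluster_id are hypothetical placeholders, not names from this thread, and a random intercept is only one possible way to express the clustering on the G side.

```sas
/* Sketch only, with hypothetical names: clustered, y, x, cluster_id. */

/* Poisson with a random intercept per cluster (G-side effect) */
proc glimmix data = clustered method = laplace;
  class cluster_id x;
  model y = x / dist = poisson link = log solution;
  random intercept / subject = cluster_id;
run;

/* Same fixed effects and same link, negative binomial distribution */
proc glimmix data = clustered method = laplace;
  class cluster_id x;
  model y = x / dist = negbin link = log solution;
  random intercept / subject = cluster_id;
run;
```

The information criteria in the Fit Statistics table of each run can then be set side by side. As noted in the reply, the comparison only makes sense when the fixed effects and the link function are identical in both fits.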