I've been trying to follow the example here http://www.ats.ucla.edu/stat/sas/faq/relative_risk.htm to estimate relative risk by Poisson regression with robust error variances, but I don't get the same criteria for assessing goodness of fit shown at the link. I get only GEE Fit Criteria, with QIC and QICu listed. I had been using the Pearson chi-square value/df to decide whether I was using the right model. For example, I was told that if the value/df was 1.25 or greater, then I should use a negative binomial model (dist=negbin) rather than Poisson. However, because I can't get the Pearson chi-square, I don't know how to assess goodness of fit.
1. Can I get Pearson chi-square statistics using Poisson regression as outlined in the link? If so, how?
2. Am I using the correct logic in deciding which procedure to use? The link also mentions log binomial, but under what conditions? When I tried log binomial, I didn't get criteria for assessing goodness of fit that I knew how to interpret.
Thanks,
Ryan
Here's my SAS code:
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 id;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40 / dist = poisson link = log;
  repeated subject = id / type = unstr;
  estimate 'NonTTS vs. BF' hosplevel 1 0 -1 / exp;
  estimate 'TTS vs. BF' hosplevel 0 1 -1 / exp;
  estimate 'NoCollege' NoCollege 1 -1 / exp;
  estimate 'cesarean' cesarean 1 -1 / exp;
  estimate 'PreTerm' PreTerm 1 -1 / exp;
  estimate 'LBW' LBW 1 -1 / exp;
  estimate 'NICU' NICU 1 -1 / exp;
  estimate 'UnMarried' UnMarried 1 -1 / exp;
  estimate 'teen' teen 1 -1 / exp;
  estimate 'Over40' Over40 1 -1 / exp;
run;
Accepted Solutions
The User's Guide for GENMOD says that you do not get the Pearson chi-square and df ratio when you use a REPEATED statement. When you use a REPEATED statement, you are essentially rescaling your data so that the variability is comparable to that found for a Poisson (or whatever distribution is specified). The Pearson chi-square/df would have no meaning.
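As a rough sketch of that point, the same mean model can be fit without the REPEATED statement so that GENMOD prints its usual goodness-of-fit table. The data set and variable names are copied from the question; dropping id from the CLASS statement is my assumption, since it is only needed as the SUBJECT= variable. The standard errors from this run are model-based rather than robust, so it is meant only for inspecting the fit table, not for reporting the risk ratios.

```sas
/* Sketch only: same mean model as in the question, minus the REPEATED
   statement, so the "Criteria For Assessing Goodness Of Fit" table
   (Pearson Chi-Square and Value/DF) appears in the output. */
proc genmod data = nbscrBirthVars_recode descending;
  /* id is left out of CLASS here (assumption: it was only used as SUBJECT=) */
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU
                       UnMarried teen Over40 / dist = poisson link = log;
run;
```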
Guessing the meaning of your variables from their names, the following question springs to mind: Why not DIST=BINOMIAL LINK=LOGIT?
PG
I'm looking for a risk ratio rather than an odds ratio. It's my understanding that link=logit is equivalent to proc logistic, which is what I don't want.
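For comparison, the log-binomial model mentioned in the original question keeps the binomial distribution but swaps the logit link for a log link, so exponentiated coefficients are risk ratios rather than odds ratios. A minimal sketch, assuming NoBreastfeed is coded 0/1 and reusing the names from the code above (only one ESTIMATE statement is shown):

```sas
/* Sketch only: log-binomial model. With DIST=BINOMIAL and LINK=LOG,
   exp(coefficient) is a risk ratio. DESCENDING makes GENMOD model the
   probability of the higher response level (assumed: NoBreastfeed = 1). */
proc genmod data = nbscrBirthVars_recode descending;
  class hosplevel NoCollege cesarean PreTerm LBW NICU UnMarried teen Over40;
  model NoBreastfeed = hosplevel NoCollege cesarean PreTerm LBW NICU
                       UnMarried teen Over40 / dist = binomial link = log;
  estimate 'NonTTS vs. BF' hosplevel 1 0 -1 / exp;
run;
```

Log-binomial models can fail to converge because the log link does not keep fitted probabilities below 1, which is one reason the modified Poisson approach is often used instead.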
Thank you. You're right. I tried it without the REPEATED statement and got the Pearson chi-square and df ratio. The instructions I'm following at the link above use the REPEATED statement with robust error variance (Zou G. A Modified Poisson Regression Approach to Prospective Studies with Binary Data. Am J Epidemiol 2004; 159(7):702-6.). It's interesting that the output shown in the example, which uses the same coding, contains the Pearson chi-square and df ratio.
Can I interpret your words, "rescaling your data so that the variability is comparable to that found for a Poisson", to mean that this approach makes my data fit the Poisson, and thus no goodness-of-fit test is necessary?
Thanks again. I appreciate your help.
I hope it is ok to reply to an old thread, but I had the same question about my fit statistics disappearing when I add a repeated statement. Is overdispersion not an issue in GEE? How would I choose between Poisson, Negative Binomial, Zero-Inflated, etc. in a clustered data context? Thank you.
Going out on a limb here, but if you fit the repeated nature as a G-side matrix in PROC GLIMMIX, and use method=laplace or method=quad, you will get quasi-likelihood information criteria, which could be used to rank the distributions, provided there is no difference in the fixed effects part of the model AND an identical link function is used. This would work for comparing Poisson to negative binomial. Unfortunately, zero-inflated models aren't easily fit in GLIMMIX, and this probably isn't a good way to compare them in any case, as they are truly mixture models, so the resulting quasi-likelihood won't reflect the same "data".
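A minimal sketch of that comparison might look like the following. The names mydata, y_count, x1 and cluster_id are hypothetical placeholders, not variables from this thread, and a random intercept per cluster is only one assumed way of expressing the clustering on the G side. Both runs use the same fixed effects and the same log link, so the information criteria in each Fit Statistics table (for example AIC and BIC) can be set side by side.

```sas
/* Sketch only. Hypothetical names: mydata, y_count, x1, cluster_id. */

/* Poisson, with a G-side random intercept for each cluster */
proc glimmix data = mydata method = laplace;
  class cluster_id;
  model y_count = x1 / dist = poisson link = log solution;
  random intercept / subject = cluster_id;
run;

/* Negative binomial: same fixed effects, same link, same random effect */
proc glimmix data = mydata method = laplace;
  class cluster_id;
  model y_count = x1 / dist = negbin link = log solution;
  random intercept / subject = cluster_id;
run;
```

Under these assumptions, the distribution with the smaller information criteria would be the one preferred.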
Something more general using PROC NLMIXED could probably be done, but my brain is turning to mush and I don't have a link to that right at hand.
Steve Denham