BTAinRVA
Quartz | Level 8

I'm running the code below and I get a deviance/df value of 1.4. Is that considered to be serious enough that I need to correct for overdispersion?

 

proc genmod data = icu.final_exposure;
  class exposure;
  model mh = exposure / dist = poisson link = log offset = lnpt;
  estimate 'logrr' exposure 1 / exp;
  lsmeans exposure / ilink cl;
run;

 

Thanks,

Brian

1 ACCEPTED SOLUTION

BTAinRVA
Quartz | Level 8

Steve,

 

Thanks again for your valuable insight!

 

Brian

SteveDenham
Jade | Level 19

I would vote no. I don't think you really have overdispersion until you start getting double-digit values for deviance/df. However, you may want to investigate a negative binomial distribution, just in case.

 

Steve Denham
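
A minimal sketch of the comparison suggested above, reusing the dataset and variable names from the original question (icu.final_exposure, mh, exposure, lnpt). The first fit keeps the Poisson model but scales the standard errors by the deviance-based dispersion estimate; the second swaps in a negative binomial distribution. Whether either is needed for these data is a judgment call that this thread does not settle.

```sas
* Poisson fit with standard errors scaled by the deviance/df estimate;
proc genmod data = icu.final_exposure;
  class exposure;
  model mh = exposure / dist = poisson link = log offset = lnpt scale = deviance;
run;

* negative binomial fit for comparison;
proc genmod data = icu.final_exposure;
  class exposure;
  model mh = exposure / dist = negbin link = log offset = lnpt;
run;
```

If the negative binomial dispersion estimate is close to zero and both fits give similar estimates and confidence limits, the plain Poisson fit is probably adequate.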


JacobSimonsen
Barite | Level 11

I would also vote no, but for a different reason than the one @SteveDenham mentions.

 

You can completely ignore overdispersion in a Poisson regression model like this one. The reason is that the data do not need to be Poisson distributed. The data behind your number of events are actually time-to-event data. If the assumption of piecewise constant rates is fulfilled, the data can be analyzed by Poisson regression, because the likelihood function of the Poisson regression is (up to terms that do not involve the parameters) exactly the likelihood function you would want to maximize if you had the original time-to-event data. It is therefore wrong to use Poisson regression in such a model to validate the distribution of the data; it is only a trick to maximize the likelihood function and thereby obtain estimates and the relevant hypothesis tests about the covariates.
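
The likelihood argument can be written out for the simplest case of a single constant rate $\lambda$. The notation is introduced here for illustration only: subject $i$ is followed for time $t_i$ and has event indicator $d_i$ (1 if an event was observed, 0 if censored). The first line is the time-to-event log-likelihood; the second is the Poisson log-likelihood with mean $\lambda t_i$, which is what an offset of $\log t_i$ with a log link produces.

```latex
\begin{align*}
\ell_{\text{surv}}(\lambda) &= \sum_i \left( d_i \log \lambda - \lambda t_i \right) \\
\ell_{\text{Pois}}(\lambda) &= \sum_i \left( d_i \log(\lambda t_i) - \lambda t_i - \log d_i! \right) \\
                            &= \ell_{\text{surv}}(\lambda) + \sum_i \left( d_i \log t_i - \log d_i! \right)
\end{align*}
```

The last sum does not depend on $\lambda$, so both log-likelihoods are maximized at the same value, $\hat\lambda = \sum_i d_i / \sum_i t_i$.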

 

It is actually quite easy to verify: simulate n data points from an exponential distribution and cumulate the values. You can then estimate the rate using Poisson regression (model n= / dist=poisson link=log offset=logcumtime). In such a model it is obviously meaningless to talk about overdispersion, even though the dispersion index will be shown. So just forget about dispersion in this kind of Poisson regression.

 

If the data were truly count data, it would be much more relevant to examine the assumption of Poisson-distributed data, and the dispersion index would then be much more relevant.

JacobSimonsen
Barite | Level 11

In this example I illustrate my point by simulating data from exponential distributions. Using the similarity of the likelihood functions, I can estimate the rate by Poisson regression. Note that I don't fit random observations: what I have on the left side of the model is the number of observations. It is therefore meaningless to talk about Poisson-distributed observations, and so it does not make sense to check whether the data are Poisson distributed.

 

data silly_data;
  do group=1 to 2;
    do i=1 to 1000;
      event=1;
      time=rand('exponential',4);
      logtime=log(time);
      output;
    end;
  end;
run;

* aggregate: total time per group (_freq_ holds the number of events);
proc summary data=silly_data nway;
  var time;
  class group;
  output out=summary sum=sumtime;
run;

data summary;
  set summary;
  logtime=log(sumtime);
run;

* estimate the rate parameter using the aggregated form of the data;
* (the estimate is reported on the log scale);
proc genmod data=summary;
  model _freq_= / dist=poisson link=log offset=logtime;
  estimate 'rate' intercept 1;
run;

* the same estimate can be obtained using the unaggregated form of the data;
proc genmod data=silly_data;
  model event= / dist=poisson link=log offset=logtime;
  estimate 'rate' intercept 1;
run;
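
As a quick cross-check, and assuming the intercept-only models above, the maximum likelihood estimate of a constant rate has the closed form "number of events divided by total time at risk". The sketch below computes it directly from silly_data; the column names rate and log_rate are chosen here for illustration. The log_rate value should agree with the 'rate' estimate from both PROC GENMOD runs, since that estimate is reported on the log scale.

```sas
* closed-form MLE of a constant rate: number of events / total time at risk;
proc sql;
  select sum(event) / sum(time) as rate,
         log(calculated rate)   as log_rate
  from silly_data;
quit;
```

If the second argument of rand('exponential',4) is read as the scale (mean) of the distribution, rate should come out close to 0.25.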
