BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10

Hi everyone,

Apologies if this is not the right place to ask this question. 

I am really confused at a higher level in terms of using Cox regression vs Poisson....

It is cleat to me that Cox regression models time to event, and Poisson regression models counts or rates of events. 
But the applications of these models are not always as clear...

Take this example:

I am looking at a cohort of cancer patients who had treatment X, all these patients were (initially) cured after treatment. However, some of them developed cancer recurrence after a period of time. I want to look at whether recurrence is higher in the group that has large cancer to start with.

So whether the patient had recurrence or not on follow up is important, and time to recurrence is important too here.

Traditionally I analysed this kind of data/question using Cox regression with tumour size (categorical variable, large vs small) included as an independent variable in the model.

proc phreg data=have ;
Title 'Cox for recurrence';
class Sex(ref='F')   Size(ref='Small') ;
model time*Censor(0)=   Age Sex Size /rl;
run;

However, one could argue that a Poisson regression could also be used in this this case (assuming the median and variance are equal in the sample), modelling the number of recurrences in the group with large tumour vs small tumour. 

proc glimmix data=have;
class Sex(ref='F')   Size(ref='Small');
model recurrence=Age Sex Size/ dist=poisson offset=logtime s cl link=log;
run;

I am really confused here, and would appreciate any opinion regarding how to choose the best approach. 

Kind regards

Am

1 ACCEPTED SOLUTION

Accepted Solutions
JacobSimonsen
Barite | Level 11

If you assume piecewise constant hazard rates, then the likelihood function (as a function of parameteres) has same form as if the number of events had been poisson distributed. Therefore, you can use poisson regression on time to event data where you have counts on left side in the model statement. If you chop the timeaxis into finer and finer pieces, then the model will be equivalent to a cox-regression, and in that case the difference is only that the parameter of the time-effect is non-parametric in the cox-regression while it will be estimated together with other parametes in the Poisson regression model.

 

Be aware, there is no assumption that the counts actually are poisson distributed - and they are obviously not since the count are limited by the number of subjects in the trial, whereas if it was poisson distributed then there would not be an upper limit. Still, it is called poisson regression because of the similarity of the likelihood functions. Since data are not poisson distributed, it will not give meaning to apply model check that rely on the poison distribution ("variance = mean" does not give sense here).

View solution in original post

6 REPLIES 6
Ksharp
Super User
Yeah. They have something in common.
If your data is not censored , Poisson Regression is also a right model , otherwise Life analysis.
ammarhm
Lapis Lazuli | Level 10

Thanks @Ksharp 

Some of the observations are censored, as some patients reached end of follow up time and did not have an event and some patients died of other causes. 

What i have read is that a Poisson regression could actually produce exactly the same results and parameter estimates as Cox model if you split p follow up time to very small intervals ... not sure if this is the case or how to do it in SAS though....

AM

Ksharp
Super User
Yes. I think so . I don't know if the parameter estimators is the same as Survival Analysis because I don't test it before.

But you really could use Poisson regression if data is NOT censored.
Check PROC GENMOD 's documentation,in it there is an example about using survival data to build a Poisson . Put time at OFFSET= option as a measure unit of COUNT.

proc genmod.......;
model count=x y z/ offset=time dist=poisson;
run;
ammarhm
Lapis Lazuli | Level 10

The more I read about this the more convinced I am that you can get the same results with either...

See this paper in R on Cox vs Poisson

http://bendixcarstensen.com/WntCma.pdf

 

Here is a quick example i made, but needs further fine tuning to split the followup time for the poisson model to smaller intervals (p values not exactly the same for both models). Any further edits are appreciated. As you can see the Risk ratios (HR or RR) are the same) 

 

* Example based on data from https://stats.idre.ucla.edu/sas/seminars/sas-survival/;
* Data whas500 can be downloaded form https://stats.idre.ucla.edu/sas/seminars/sas-survival/;
* I downloaded tge data to X:\Data

libname x 'x:\Data';


proc format ;
value gender
	  0 = "Male"
	  1 = "Female";
run;

data whas500;
set x.whas500;
format gender gender.;
logpyrs=log(lenfol);
run;

ods output parameterestimates=Cox;

proc phreg data = whas500 ;
class gender;
model lenfol*fstat(0) = gender age/rl;
run;

Data Cox2 ;
set Cox;
combined_RR=compress(put(exp(estimate),10.4) ||'(' ||put((HRLowerCL),10.4)||'-' || put((HRupperCL),10.4)||')');
Model='Cox';
p_value=ProbChiSq;
keep Model parameter combined_rR p_value;

run;

ods output parameterestimates=Poisson;
proc glimmix data=whas500 ;
model  fstat=gender age/ dist=poisson offset=logpyrs s cl link=log;
run;

Data poisson2 (rename=(effect=parameter));
set poisson;
combined_RR=compress(put(exp(estimate),10.4) ||'(' ||put(exp(lower),10.4)||'-' || put(exp(upper),10.4)||')');
Model='Poisson';
p_value=probt;
Keep model effect combined_rr p_value;
where effect ne 'Intercept';
run;
Data comb;
length  model $ 8;
set cox2 poisson2;
run;

proc sort data=comb;
by   parameter;
run;
proc print data=comb;
var model parameter combined_rr p_value;
run;

Here is the result, not exactly the same but pretty close. IF anyone can help splitting follow up time maybe we could get it even closer. 

 

Obs

Parameter

model

RR (95% CI)

p_value

1

AGE

Cox

1.0691 (1.0562-1.0822)

0.00000

2

AGE

Poisson

1.0758 (1.0626-1.0891)

0.00000

3

GENDER

Cox

0.9365 (0.7110-1.2336)

0.64096

4

GENDER

Poisson

0.9258 (0.7032-1.2187)

0.58169

 

Ksharp
Super User

Maybe @StatDave   @Rick_SAS  @lvm  could give you exact answer.

JacobSimonsen
Barite | Level 11

If you assume piecewise constant hazard rates, then the likelihood function (as a function of parameteres) has same form as if the number of events had been poisson distributed. Therefore, you can use poisson regression on time to event data where you have counts on left side in the model statement. If you chop the timeaxis into finer and finer pieces, then the model will be equivalent to a cox-regression, and in that case the difference is only that the parameter of the time-effect is non-parametric in the cox-regression while it will be estimated together with other parametes in the Poisson regression model.

 

Be aware, there is no assumption that the counts actually are poisson distributed - and they are obviously not since the count are limited by the number of subjects in the trial, whereas if it was poisson distributed then there would not be an upper limit. Still, it is called poisson regression because of the similarity of the likelihood functions. Since data are not poisson distributed, it will not give meaning to apply model check that rely on the poison distribution ("variance = mean" does not give sense here).

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 9780 views
  • 10 likes
  • 3 in conversation