Adam1
Calcite | Level 5

Hi

I'm modeling time to event data with Cox regression (PROC PHREG). I have used PROC PHREG before without running into trouble.

Event of interest: heart failure.

Baseline variables: age, sex.

Time-dependent variables: cholesterol, blood pressure, blood glucose.

Each individual has several observations, each with a start and stop time, for these variables (see table below).

I have checked my data set; it is most likely configured correctly and looks like the table below.

The table shows an individual who experienced an event between his 3rd and 4th observations.

Subject  start_time  stop_time  cholesterol  event
1        0           2          2            0
1        2           6          2            0
1        6           7          4            0
1        7           10         5            0
1        10          13         3            0
1        13          16         3            1

Now, you might notice that the event indicator shows the individual having an event on his last observation, despite the event occurring between the 3rd and 4th observations. I was instructed to do this by an experienced statistician; she told me that, in a time-dependent Cox regression with counting process syntax, individuals who experience the event should have that indicated on their last observation, regardless of when the event occurred.


This is somewhat counterintuitive and I wonder whether it is correct, particularly since my results show that higher cholesterol, higher blood pressure and increasing age are associated with a lower hazard of heart failure (hazard ratios below 1); this is certainly wrong.

Is there something I am forgetting?

I would appreciate some advice on this.

5 REPLIES
Doc_Duke
Rhodochrosite | Level 12

Adam,

I haven't done one of these in quite a while (long enough ago that the counting process syntax wasn't available), so I'll venture a guess.

It appears that what you are interested in is the 'initiation' of heart failure (in stochastic processes, heart failure would be an "absorbing event" -- once you have heart failure, you can never again 'not' have it).  To me, that is the only way that having a single event makes sense.   In that case, covariates after the onset of heart failure do not contribute to your understanding of the incidence and should be discarded.  Then the event would be associated with the interval in which it occurred as well as the last interval.

This would also make the data consistent with what your local statistician said.

Doc Muhlbaier

Duke

Adam1
Calcite | Level 5

Thank you, Doc Muhlbaier,

I've carefully read your reply and I agree with it. However, it might be that PROC PHREG handles it this way. Moreover, I have not found a single SUGI paper commenting on this, nor anything in Paul Allison's book. I have searched Google extensively as well.

I will try your solution though, and get back to you with the results.

Thanks.

MarcHuber
SAS Employee

Hi, Adam,

This has a lot to do with your definition of event.  If event is defined as the onset of heart failure and that, by definition, can only occur once (you and Doc Muhlbaier would know better than I), then the remaining observations after the event occurrence only add bias by inflating the risk set at times where some individuals are no longer at risk.  They have heart failure, but are no longer at risk of having the onset of heart failure, just as someone who has died will always be dead but is no longer at risk for the event of dying.  If this is the case here (only one event can occur and it is defined to have occurred at onset), then the indicator for event should be 1 for the interval that ends at onset.  So, if they had the event at time 7, then event should be 1 at the third interval.  In addition, for this analysis, the remaining observations beyond the third observation should not be included in the analysis.  Unless they are at risk for the onset of heart failure again, anything that occurs after onset is to be ignored.
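To make that layout concrete, here is a minimal Python sketch of the row logic (in practice the same thing would be done in a DATA step before calling PROC PHREG). It assumes one subject's intervals are sorted by start time and that the onset time is known separately; the names Interval and truncate_at_onset are made up for illustration. The example rows are the ones from the table in the original question, with onset at time 7.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    subject: int
    start: float
    stop: float
    cholesterol: float
    event: int


def truncate_at_onset(rows, onset_time: Optional[float]):
    """Keep one subject's intervals up to onset and flag the interval ending there.

    rows: Interval records for a single subject, sorted by start time.
    onset_time: time of the event, or None if the subject was censored.
    """
    out = []
    for r in rows:
        if onset_time is not None and r.start >= onset_time:
            break  # subject is no longer at risk: drop this and later rows
        if onset_time is not None and r.stop >= onset_time:
            # Onset falls in this interval: end it at onset and flag the event.
            out.append(r._replace(stop=onset_time, event=1))
            break
        out.append(r._replace(event=0))
    return out


# Rows from the question: event flagged on the last row, at time 16.
rows = [
    Interval(1, 0, 2, 2, 0),
    Interval(1, 2, 6, 2, 0),
    Interval(1, 6, 7, 4, 0),
    Interval(1, 7, 10, 5, 0),
    Interval(1, 10, 13, 3, 0),
    Interval(1, 13, 16, 3, 1),
]

fixed = truncate_at_onset(rows, onset_time=7)
for r in fixed:
    print(r)
```

With onset at time 7 this keeps only the first three rows and moves the event flag to the interval (6, 7]. If onset fell strictly inside an interval, that interval's stop time would be shortened to the onset time.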

Now, if heart failure were a repeatable event, then you would still have a 1 for event at the third interval, but you could also have a 1 at later intervals where the event occurred again.  Like I said, I am not an expert on heart failure, so I don't know what can happen.  However, I trust Doc Muhlbaier in his or her assertion that it cannot recur, since it cannot be undone (except by transplant?).

What you currently have (what your statistician friend suggested) can only be correct if the event itself is defined to have occurred at time 16.  So, for instance, if 'event' is defined to be death and that occurred for this subject at time 16, then this might be correct.  If that were the case, then heart failure would be a separate indicator variable that can be used as a predictor of death.  It could be coded as '0' for all time intervals preceding onset of heart failure and '1' for all intervals from onset until death (or censoring).
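As a rough illustration of that second setup, the Python sketch below builds the heart-failure indicator as a time-dependent covariate when the outcome is death. Everything here is hypothetical: the function name add_hf_covariate, the tuple layout, and the scenario in which the subject dies at time 16 after heart failure onset at time 7. An interval that contains the onset strictly inside it is split in two so the covariate is constant within each row.

```python
def add_hf_covariate(intervals, hf_onset):
    """Add a 0/1 heart-failure covariate to one subject's intervals.

    intervals: list of (start, stop, death) tuples sorted by start time.
    hf_onset: onset time of heart failure, or None if it never occurred.
    Returns a list of (start, stop, hf, death) tuples.
    """
    out = []
    for start, stop, death in intervals:
        if hf_onset is None or stop <= hf_onset:
            out.append((start, stop, 0, death))   # entirely before onset
        elif start >= hf_onset:
            out.append((start, stop, 1, death))   # entirely after onset
        else:
            # Onset strictly inside the interval: split it at onset.
            out.append((start, hf_onset, 0, 0))
            out.append((hf_onset, stop, 1, death))
    return out


# Hypothetical scenario: death at time 16, heart failure onset at time 7.
death_rows = [(0, 2, 0), (2, 6, 0), (6, 7, 0), (7, 10, 0), (10, 13, 0), (13, 16, 1)]
with_hf = add_hf_covariate(death_rows, hf_onset=7)
for row in with_hf:
    print(row)
```

Here all six rows are kept, the death flag stays on the last row, and the hf column switches from 0 to 1 at time 7.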

If you really have a repeatable event, then there are several models to choose from.  In most cases, you'd need to have another variable for event number, to indicate which event a subject is currently at risk for, and you'd have to adjust the model for the lack of independence of events within subject.
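For the event-number variable, a small Python sketch of one possible coding is below. It assumes the event flag marks every interval that ends in an event and numbers each row by which event the subject is currently at risk for. The function name add_event_number and the example with a second event at time 16 are invented for illustration, and the adjustment for within-subject dependence is a separate modelling step that is not shown.

```python
def add_event_number(intervals):
    """Number each row by the event the subject is at risk for.

    intervals: list of (start, stop, event) tuples for one subject,
    sorted by start time, with event = 1 on rows that end in an event.
    Returns a list of (start, stop, event, event_number) tuples.
    """
    out = []
    events_so_far = 0
    for start, stop, event in intervals:
        out.append((start, stop, event, events_so_far + 1))
        events_so_far += event  # later rows are at risk for the next event
    return out


# Hypothetical repeatable event: first event at time 7, second at time 16.
recurrent_rows = [(0, 2, 0), (2, 6, 0), (6, 7, 1), (7, 10, 0), (10, 13, 0), (13, 16, 1)]
numbered = add_event_number(recurrent_rows)
for row in numbered:
    print(row)
```

The first three rows get event number 1 and the last three get event number 2.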

Marc

JacobSimonsen
Barite | Level 11

I completely agree with Marc's reply. The event should be at the observation where the event took place. The stop time at that observation should be the time of the event, and this time interval has to be the last one for that individual.

I think that is also what you were advised to do by your statistician (the event should be at the last interval for an individual). Intervals after the event took place should be deleted.
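A quick way to check a subject's rows against these two rules is sketched below in Python. It is only an illustration: the function name check_subject and the tuple layout are assumptions, and it requires that the true onset time is available separately from the event flag.

```python
def check_subject(intervals, onset_time=None):
    """Return a list of layout problems for one subject (empty if none found).

    intervals: list of (start, stop, event) tuples sorted by start time.
    onset_time: true time of the event, or None if the subject was censored.
    """
    problems = []
    last_index = len(intervals) - 1
    for i, (start, stop, event) in enumerate(intervals):
        if event == 1 and i < last_index:
            problems.append(f"rows continue after the event flagged at time {stop}")
    last_start, last_stop, last_event = intervals[-1]
    if onset_time is not None:
        if last_event != 1:
            problems.append("subject had the event but the last row is not flagged")
        if last_stop != onset_time:
            problems.append(
                f"last stop time {last_stop} differs from onset time {onset_time}"
            )
    elif last_event == 1:
        problems.append("last row is flagged but no onset time was given")
    return problems


# Layout from the original question: event flagged at time 16, onset at time 7.
original = [(0, 2, 0), (2, 6, 0), (6, 7, 0), (7, 10, 0), (10, 13, 0), (13, 16, 1)]
# Corrected layout: rows after onset deleted, event on the interval ending at 7.
corrected = [(0, 2, 0), (2, 6, 0), (6, 7, 1)]

print(check_subject(original, onset_time=7))
print(check_subject(corrected, onset_time=7))
```

The original layout is reported because its last stop time (16) is not the onset time (7); the corrected layout produces no problems.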

Jacob

Adam1
Calcite | Level 5

Hi again

It is still unclear to me why the statistician recommended the aforementioned data layout for individuals with events. It is now clear to me, after the above explanations and further discussions with other statisticians, that individuals with events need no observations after the event time if the event is permanent (in this case, once heart failure, always heart failure).

I appreciate your answers above!

Thank you
