Solved: Re: Counting Process and Interactions

andym90

Hello everyone,

I currently have a dataset in counting process form and I am running PROC PHREG on it. The dataset contains one type of event that is recurrent, plus fixed descriptors measured at the start or as "ever". I assumed that I should check whether the fixed covariates violate the PH assumption, and I saw that some did. My idea was then to make the variables that violated the PH assumption interact with time. A lot of the literature talks about counting process datasets with time-dependent covariates, but not about fixed covariates interacting with time (not an event). I assumed I could code this the same way I would without counting process data, as below ("program style"). My code ran, but I wonder whether it is doing what I want it to do. It looks like this (let's say x1 is a fixed continuous variable measured at entry, the same for all IDs, that violated the Cox PH assumption):

Proc phreg data=mydata covs(aggregate) multipass;

Model (start stop) * event(0) = x1 x1*time x2 x3/rl alpha=0.05;

Id ID;

Time=stop;

Hazardratio x1 /at (time= 0 1 2 3 4 5 6 7 8 9 10);

Run;

Is this correctly accounting for the interaction? Any clarity would be great!

JacobSimonsen

Hi,

I think you should instead use "stop" directly in the MODEL statement: "Model (start stop) * event(0) = x1 x1*stop x2 x3/rl alpha=0.05;". You then estimate the effect of x1 as a constant plus a term that is linear in time, and you can test whether the time-dependent term is significant.

- though this works only under the assumption that, if there is time dependency, it is linear.

andym90

Both approaches give very different answers. I wonder why? I can't find a single example of someone applying an interaction in counting process data, and this has me stumped. How is SAS handling the interaction term?

JacobSimonsen

I have not really understood what confuses you. It should make no difference whether you use counting process style ((entry exit)*event(0)) or delayed entry style (using the entry time as an option).

Also, what you have made is not really an interaction term. The x1*stop term just adds an effect equal to b multiplied by x1 multiplied by time, and the model then tests whether b = 0.
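
To spell that out, here is a sketch of the model that the x1*stop term implies, assuming the time-varying part of the effect really is linear. The symbols h_0, beta_1, beta_2, beta_3 and b are notation introduced here for illustration; they are not PHREG output names.

```latex
h(t \mid x) = h_0(t)\,\exp\bigl(\beta_1 x_1 + b\,x_1 t + \beta_2 x_2 + \beta_3 x_3\bigr),
\qquad
\mathrm{HR}_{x_1}(t) = \exp(\beta_1 + b\,t)
```

So the hazard ratio for a one-unit increase in x1 is exp(beta_1) at t = 0 and is multiplied by exp(b) for each additional unit of time. If b = 0, the hazards for x1 are proportional again.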

andym90

Sorry for any confusion (I accidentally marked this as solved and it won't undo); just to clarify and expand: I fit an Andersen-Gill model to this counting process data, which contains a repeating event and a number of time-fixed covariates (for example, age at diagnosis). I assume that since Andersen-Gill is an extension of Cox PH, the fixed variables still need to satisfy the Cox PH assumptions. After I ran ZPH on my model and looked at a few curves, I noticed that some of these variables (for example, age at diagnosis) violated the PH assumption: their effects changed over time. This could also be due to my long follow-up time.

To my understanding, one way to handle such a violation is to model the variable effect as being time-dependent using an interaction term with time. However, most of the examples I have read do not perform such an interaction on models using counting process data (that is clustered by ID).

I tried the following methods:

1 - (from my original)

Proc phreg data=mydata covs(aggregate) multipass;

Model (time_start time_stop) * event(0) = x1 x1*time x2 x3/rl alpha=0.05;

Id ID;

Time=time_stop;

Hazardratio x1 /at (time= 0 1 2 3 4 5 6 7 8 9 10);

Run;

2 - The method from your comment:

Proc phreg data=mydata covs(aggregate) multipass;

Model (time_start time_stop) * event(0) = x1 x1*time_stop x2 x3/rl alpha=0.05;

Id ID;

Hazardratio x1 /at (time_stop= 0 1 2 3 4 5 6 7 8 9 10);

Run;

I get very different hazard ratios at the different time points from each method. I am unsure which method is correct, and/or whether the Cox model is handling the interaction correctly. I am also unsure whether my logic makes sense, as there is very limited information on whether you even test an Andersen-Gill model for the Cox PH assumptions, given that there is already a time-dependent element in the model (the repeating event). I know another way to handle a violation is piecewise regression, splitting the variable's effect over the follow-up at meaningful time points, as in this example:

Proc phreg data=mydata covs(aggregate) multipass;

Model (time_start time_stop) * event(0) = x1time1 x1time2 x2 x3/rl alpha=0.05;

Id ID;

x1time1 = x1*(0<time_stop<=1);

x1time2 = x1*(time_stop>1);

Run;

But I am also unsure if this code is accurately splitting the model up over this parameter and clustering by ID in the counting process data.
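
For reference, the model this piecewise coding is meant to express can be sketched as follows. It assumes that both pieces are built from x1 and that the indicators are evaluated at the event time t of each risk set. The symbols h_0, gamma_1, gamma_2, beta_2 and beta_3 are notation introduced here for illustration only.

```latex
h(t \mid x) = h_0(t)\,\exp\bigl(\gamma_1 x_1 \mathbf{1}\{0 < t \le 1\}
            + \gamma_2 x_1 \mathbf{1}\{t > 1\} + \beta_2 x_2 + \beta_3 x_3\bigr),
\qquad
\mathrm{HR}_{x_1}(t) =
\begin{cases}
\exp(\gamma_1), & 0 < t \le 1,\\
\exp(\gamma_2), & t > 1.
\end{cases}
```

Under this form the effect of x1 is constant within each interval, and comparing gamma_1 with gamma_2 shows how much it changes at the cut point.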

Does that make sense? I just want to make sure I am capturing the time-dependent interaction correctly, and have no idea why the results are differing.

Thanks again for all your help and consideration.

Best