Solved: Interpreting Cox regression

Kyra · Posted 11-11-2019 04:00 PM

Hi,

I am trying to evaluate the characteristics of small bowel neuroendocrine tumor (SB-NET) patients associated with progression after multimodal surgical resection.

I first created a table for descriptive analysis of the patient groups (progressed group vs stable group) and then did univariate Cox regression modelling for progression.

For age at diagnosis, I ran the following for the descriptive table:

proc univariate data=red.net1;
   class progression1;
   var Age_at_diagnosis1;
run;

proc npar1way data=red.net1 wilcoxon;
   class progression1;
   var Age_at_diagnosis1;
run;

I found

	Stable (53)	Progression (42)	p- value
Age at diagnosis	59 (50-65)	59.5 (52-67)	0.4739

For univariate Cox regression, I ran the following:

proc phreg data=red.net2;
   model time2*progression1(0)=Age_at_diagnosis1 / rl;
run;

The result I got is:

Univariate Cox regression Results for Progression
Variable	at risk	HR	95% CI	p-value
Age_at_diagnosis	95	1.042	1.014 - 1.070	0.0032

I am wondering why age at diagnosis is similar in the two groups on the descriptive test but significant in the univariate Cox regression.

Can you please help me understand the reasoning behind this?

Thanks!

FreelanceReinh · Posted 11-12-2019 06:18 AM

Hi @Kyra,

The statistical tests you are comparing have very different null (and alternative) hypotheses: The Wilcoxon rank sum test compares the locations of the age distributions between the two groups (H₀: no location shift). It is "static" in the sense that it ignores time (i.e. variable time2).

For Cox regression, however, time is crucial as it examines the impact of (in your example) age at diagnosis on the time to progression, taking censoring into account (H₀: age has no impact on progression hazard). Unlike the Wilcoxon test, it could be performed (and yield a significant result) even without the patients in group "Stable" -- just based on the relationship between age and time to progression among the patients experiencing disease progression. So, by changing time2 values you could modify the significance of age at diagnosis substantially while the result of the Wilcoxon test, of course, would remain unchanged.
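To make the contrast concrete, here is a minimal sketch on toy data. The 12 records below are invented purely for illustration (they are not the SB-NET cohort), and time2_rev is a hypothetical second time variable in which the progressors' times are reversed. The ages are identical in both groups, so the Wilcoxon test sees no location shift, yet the two Cox models point in opposite directions because only the times differ.

```sas
/* Toy data, invented for illustration only (not the SB-NET cohort).  */
/* Columns: progression1, Age_at_diagnosis1, time2, time2_rev         */
/* time2_rev is a hypothetical variable: same events, reversed times. */
data toy;
   input progression1 Age_at_diagnosis1 time2 time2_rev;
   datalines;
1 45 5.0 0.5
1 50 4.0 1.0
1 55 3.0 2.0
1 60 2.0 3.0
1 65 1.0 4.0
1 70 0.5 5.0
0 45 6.0 6.0
0 50 6.0 6.0
0 55 6.0 6.0
0 60 6.0 6.0
0 65 6.0 6.0
0 70 6.0 6.0
;
run;

/* Wilcoxon test: compares age between groups and ignores time */
proc npar1way data=toy wilcoxon;
   class progression1;
   var Age_at_diagnosis1;
run;

/* Cox model with the original times: older patients progress earlier */
proc phreg data=toy;
   model time2*progression1(0)=Age_at_diagnosis1 / rl;
run;

/* Cox model with reversed times: older patients progress later */
proc phreg data=toy;
   model time2_rev*progression1(0)=Age_at_diagnosis1 / rl;
run;
```

With these toy numbers the first model should estimate a hazard ratio above 1 for age and the second one below 1, while the PROC NPAR1WAY output is the same whichever time variable is used, because that test never looks at time.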

Reeza · Posted 11-11-2019 05:49 PM

Look at the distributions/histograms of the data as well. Check how much the curves for the two groups overlap. The N/sample size is important as well.
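One possible way to do this, as a sketch that assumes the data set and variable names from the question (the HISTOGRAM statement combined with a CLASS variable draws one panel per group):

```sas
/* Comparative histograms of age at diagnosis, one panel per group. */
/* red.net1, progression1 and Age_at_diagnosis1 are taken from the  */
/* question; adjust them to your own data.                          */
proc univariate data=red.net1;
   class progression1;
   var Age_at_diagnosis1;
   histogram Age_at_diagnosis1;
run;
```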

Kyra · Posted 11-12-2019 07:43 AM

Thank you very much. I really appreciate your help!

Kyra · Posted 11-12-2019 11:19 AM

Thank you very much for the help!

One more follow-up question:

For the same patient population, I tried to calculate the median duration of follow-up, using proc univariate.

For the stable group, median follow-up from surgery to last follow-up is 2.1 years (IQR, 0.5-4.6).

For the progression group, median duration of follow-up from surgery to progression is 1.6 years (IQR, 0.9-5.7).

For the overall cohort, median duration of follow-up is 2.1 years (IQR, 0.7-5.4).

The median PFS for the entire cohort was 6.06 years (95% CI, 3.73-9.21) using the code:

proc lifetest data=red.data;
   time time2*progression1(0);
run;

I am wondering how we explain a median follow-up of 2.1 years and a median PFS of 6.06 years. How is PFS so large when median follow-up is only 2.1 years?

Thank you very much in advance. Really appreciate all your help!

FreelanceReinh · Posted 11-12-2019 01:32 PM

You're welcome.

@Kyra wrote:

I am wondering how we explain a median follow-up of 2.1 years and a median PFS of 6.06 years. How is PFS so large when median follow-up is only 2.1 years?

This is just the effect of right-censoring. Note that the entire "Stable" group (i.e. the majority of patients, 53 of 95) is regarded as censored, so these patients stay in the risk sets up to their censoring times without ever counting as events, which substantially increases the Kaplan-Meier estimate of median progression-free survival. The discrepancy between median follow-up and estimated median PFS would be smaller if only a few patients were censored and it would vanish if no censoring had occurred. You can see this by including fewer and fewer "Stable" patients in the calculations. In the extreme case of restricting both calculations to the "Progression" group, i.e., using

where progression1=1;

the two medians will be identical.
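As a rough sketch of this effect, the toy data below (10 invented records, not the actual cohort; only the variable names are borrowed from the question) have five censored and five progressed patients. Running both procedures on all records and then on the progressed patients only shows the gap between the raw median of time2 and the Kaplan-Meier median, and how it closes once censoring is removed.

```sas
/* Toy data, invented for illustration only (not the SB-NET cohort). */
/* progression1 = 0 means censored ("Stable"), 1 means progressed.   */
data toy_fu;
   input progression1 time2;
   datalines;
0 0.5
1 1.0
0 1.5
1 2.0
0 2.5
1 3.0
0 3.5
1 4.0
0 4.5
1 5.0
;
run;

/* All patients: raw median of time2 vs Kaplan-Meier median */
proc univariate data=toy_fu;
   var time2;
run;

proc lifetest data=toy_fu;
   time time2*progression1(0);
run;

/* Progressed patients only: no censoring is left */
proc univariate data=toy_fu;
   where progression1=1;
   var time2;
run;

proc lifetest data=toy_fu;
   where progression1=1;
   time time2*progression1(0);
run;
```

In the first pair the Kaplan-Meier median comes out larger than the raw median of time2, since the censored patients keep the estimated survival curve higher. In the second pair the two medians should agree. Replacing toy_fu with red.data would repeat the same comparison on the real data.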

Kyra · Posted 11-12-2019 01:36 PM

Thank you very much for the reply! I am very very grateful to you and the community!
