Hi,
I am trying to evaluate the characteristics of small bowel neuroendocrine tumor (SB-NET) patients associated with progression after multimodal surgical resection.
I first created a table for a descriptive analysis of the patient groups (progressed group vs. stable group) and then fitted a univariate Cox regression model for progression.
For age at diagnosis, I used the following for the descriptive table:
proc univariate data=red.net1;
class progression1;
var Age_at_diagnosis1 ;
run;
proc NPAR1WAY data=red.net1 wilcoxon;
class progression1;
var Age_at_diagnosis1;
run;
I found:
| | Stable (53) | Progression (42) | p-value |
| Age at diagnosis | 59 (50-65) | 59.5 (52-67) | 0.4739 |
For the univariate Cox regression I used the following:
proc phreg data=red.net2;
model time2*progression1(0)=Age_at_diagnosis1 /rl;
run;
The result I got is:
Univariate Cox regression results for progression
| Variable | At risk | HR | 95% CI | p-value |
| Age_at_diagnosis | 95 | 1.042 | 1.014 - 1.070 | 0.0032 |
I am wondering why age at diagnosis is similar in the two groups on the descriptive test but significant in the univariate Cox regression.
Can you please help me understand the reasoning behind this?
Thanks!
Hi @Kyra,
The statistical tests you are comparing have very different null (and alternative) hypotheses: The Wilcoxon rank sum test compares the locations of the age distributions between the two groups (H0: no location shift). It is "static" in the sense that it ignores time (i.e. variable time2).
For Cox regression, however, time is crucial as it examines the impact of (in your example) age at diagnosis on the time to progression, taking censoring into account (H0: age has no impact on progression hazard). Unlike the Wilcoxon test, it could be performed (and yield a significant result) even without the patients in group "Stable" -- just based on the relationship between age and time to progression among the patients experiencing disease progression. So, by changing time2 values you could modify the significance of age at diagnosis substantially while the result of the Wilcoxon test, of course, would remain unchanged.
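One way to see the point about the "Stable" group on your own data is to refit the model among the progressed patients only. This is just a sketch: it assumes red.net2 contains the same time2, progression1 and Age_at_diagnosis1 variables as in your PROC PHREG step, and whether age comes out significant in the restricted model depends entirely on your data.

```sas
/* Sketch: Cox model among patients who progressed only.
   Every remaining record has progression1=1, so nothing is censored,
   yet the model can still be fitted because it relates age to the
   ordering of the progression times. */
proc phreg data=red.net2;
  where progression1=1;
  model time2*progression1(0)=Age_at_diagnosis1 / rl;
run;

/* For contrast: the Wilcoxon test needs both groups and never reads time2. */
proc npar1way data=red.net2 wilcoxon;
  class progression1;
  var Age_at_diagnosis1;
run;
```

The second step cannot be restricted in the same way, since with only one level of progression1 there is nothing left to compare.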
Thank you very much for the help!
One more follow-up question:
For the same patient population I tried to calculate the median duration of follow-up, using PROC UNIVARIATE.
For the stable group, the median follow-up from surgery to last follow-up is 2.1 years (IQR, 0.5-4.6).
For the progression group, the median duration of follow-up from surgery to progression is 1.6 years (IQR, 0.9-5.7).
For the overall cohort, the median duration of follow-up is 2.1 years (IQR, 0.7-5.4).
The median PFS for the entire cohort was 6.06 years (95% CI, 3.73-9.21), using the code:
proc lifetest data=red.data;
time time2*progression1(0);
run;
I am wondering how to explain a median follow-up of 2.1 years together with a median PFS of 6.06 years. How can PFS be so large when the median follow-up is only 2.1 years?
Thank you very much in advance. Really appreciate all your help!
You're welcome.
@Kyra wrote:
I am wondering how to explain a median follow-up of 2.1 years together with a median PFS of 6.06 years. How can PFS be so large when the median follow-up is only 2.1 years?
This is just the effect of right-censoring. Note that the entire "Stable" group (i.e. the majority of patients, 53 of 95) is regarded as censored: these patients count as progression-free up to their last follow-up and never contribute an event, which pushes the Kaplan-Meier estimate of the median progression-free survival well above the plain median of the follow-up times. The discrepancy between median follow-up and estimated median PFS would be smaller if only a few patients were censored, and it would vanish if no censoring had occurred. You can see this by including fewer and fewer "Stable" patients in the calculations. In the extreme case of restricting both calculations to the "Progression" group, i.e., using
where progression1=1;
the two medians will be identical.
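Spelled out, that check could look like the sketch below. It assumes red.data contains time2 and progression1 as in your PROC LIFETEST step; PROC MEANS is only one of several ways to obtain the plain sample median.

```sas
/* Kaplan-Meier median restricted to progressed patients (nothing is censored) */
proc lifetest data=red.data;
  where progression1=1;
  time time2*progression1(0);
run;

/* Ordinary sample median and quartiles of the same times */
proc means data=red.data n median q1 q3;
  where progression1=1;
  var time2;
run;
```

Dropping the WHERE statement from both steps brings back the gap you observed, because the censored "Stable" patients are then treated differently by the two procedures.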
Thank you very much for the reply! I am very very grateful to you and the community!