Solved: Discrepancy Between Median Survival Time from Quartiles and Product Limit Estimates in PROC LIFETEST

smackerz1988 · Posted 07-31-2024 11:24 AM

Hello,

I'm encountering a discrepancy between the median survival times calculated using the quartiles method and the Kaplan-Meier product limit estimates in PROC LIFETEST. Here's a brief overview of my issue:

I am analysing survival data using PROC LIFETEST with the following SAS code:

proc lifetest data=adtte00 method=km conftype=LOGLOG plots=survival outsurv=survesti;
   time avalM * CNSR(1);
   strata trt; 
   by paramcd;
   ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests=logrank00;
run;

When I check the output, the median survival time in the quartiles output (dataset med00) does not match the median I read off the Kaplan-Meier product-limit estimates (dataset KM_Estimates). Specifically:

Quartiles output (med00): the median survival time is reported directly as the 50th percentile (11.5811).
Product-limit estimates (KM_Estimates): taking the first time point at which the survival probability drops to 0.5 gives a slightly different median value (11.3347).

Here’s an example of the output I’m seeing:

Quartiles Output:

data med00;
    input Paramcd $ Stratum trt $ Percent Estimate Transform $ LowerLimit UpperLimit;
    datalines;
TTD 1 trt1 50 11.5811 LOGLOG 10.4148 14.7844
TTD 2 trt2 50 4.5010 LOGLOG 3.5483 5.9138
;
run;

proc print data=med00;
run;

Product Limit Estimates:

data KM_Estimates;
    input Paramcd $ Stratum trt $ avalm Censor Survival Failure StdErr Failed Left;
    datalines;
TTD 1 trt1 11.2361 0 0.5112 0.4888 0.0375 86 92
TTD 1 trt1 11.3018 0 0.5056 0.4944 0.0375 88 90
TTD 1 trt1 11.3347 0 0.5000 0.5000 0.0375 89 89
TTD 1 trt1 11.8275 0 0.4944 0.5056 0.0375 90 88
TTD 1 trt1 12.4189 0 0.4888 0.5112 0.0375 91 87
;
run;

proc print data=KM_Estimates;
run;

Log:

25         GOPTIONS ACCESSIBLE;
26         ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests= logrank00;
27         proc lifetest data=adtte00 method= km conftype=LOGLOG plots=survival outsurv = survesti;
28            time avalM * CNSR(1);
29            strata trt;
30            by paramcd;
31         run;

NOTE: Graph's name, LIFETEST, changed to LIFETES5. LIFETEST is already used or not a valid SAS name.
NOTE: 19648 bytes written to 
      /saswork/SAS_workA34500001BB1_dsprgn05.ds-grid.com/SAS_workD3CA00001BB1_dsprgn05.ds-grid.com/lifetest5.png.
NOTE: The above message was for the following BY group:
      Parameter Code=TTD
NOTE: The data set WORK.LOGRANK00 has 3 observations and 5 variables.
NOTE: The data set WORK.MED00 has 6 observations and 8 variables.
NOTE: The data set WORK.STAT00 has 3 observations and 8 variables.
NOTE: The data set WORK.KM_ESTIMATES has 271 observations and 10 variables.
NOTE: The data set WORK.SURVESTI has 235 observations and 8 variables.
NOTE: PROCEDURE LIFETEST used (Total process time):
      real time           0.90 seconds
      cpu time            0.84 seconds

Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates? Are there specific steps I can take to ensure consistent median calculations between these methods? Any insights or guidance on resolving this discrepancy would be greatly appreciated!

FreelanceReinh · Posted 08-01-2024 06:45 AM

Hello @smackerz1988,

@smackerz1988 wrote:

Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates?

I don't see a discrepancy. Your KM_Estimates output shows that the 89th of the 178 survival times in stratum trt1 is 11.3347 and the 90th is 11.8275. Hence, all real numbers t in the closed interval [11.3347, 11.8275] qualify as "crude" (empirical) median survival times because the survival times of at least 50% (i.e., 89) of the subjects are ≤t and the survival times of at least 50% of the subjects are ≥t.

In this situation it is common to define the midpoint of that interval, (11.3347 + 11.8275)/2 = 11.5811, as the median (cf. the default QNTLDEF=5 option of PROC MEANS). This is analogous to what PROC LIFETEST does with the Kaplan-Meier estimates Ŝ(t) of the survival function: see the example for the first quartile in the section "Breslow, Fleming-Harrington, and Kaplan-Meier Methods" of the documentation, where it says: "If Ŝ(t) is exactly equal to 0.75 from t_j to t_{j+1}, the first quartile is taken to be (t_j + t_{j+1})/2." This explains why you got 11.5811 as the point estimate for the median in your quartiles output med00, given that Ŝ(t) is exactly equal to 0.5 from 11.3347 to 11.8275.
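
For anyone who wants to reproduce this rule by hand, here is a minimal sketch that applies the quartile logic quoted above to the five trt1 rows from the original post. The dataset names km_excerpt and km_median and the rounding tolerance of 1e-8 are assumptions made for illustration, not anything PROC LIFETEST exposes. A real ProductLimitEstimates table would also need BY-group and stratum handling, which is left out here.

```sas
/* Excerpt of the trt1 rows posted above (time and KM survival only). */
data km_excerpt;
    input avalm Survival;
    datalines;
11.2361 0.5112
11.3018 0.5056
11.3347 0.5000
11.8275 0.4944
12.4189 0.4888
;
run;

/* Walk the step function in time order:                               */
/*  - if Survival first drops strictly below 0.5 at t, the median is t */
/*  - if Survival sits exactly at 0.5 from t_j until the next event    */
/*    time, the median is the midpoint of t_j and that next time.      */
/* If Survival reaches 0.5 at the very last event time, this sketch    */
/* leaves the median missing.                                          */
data km_median;
    set km_excerpt end=last;
    where Survival is not missing;   /* skip rows without an estimate  */
    retain t_j median;               /* both start out missing         */
    if median = . then do;
        if t_j ne . then median = (t_j + avalm) / 2;
        else if round(Survival, 1e-8) < 0.5 then median = avalm;
        else if round(Survival, 1e-8) = 0.5 then t_j = avalm;
    end;
    if last then output;
    keep median;
run;

proc print data=km_median noobs;
run;
```

On the posted rows this should return 11.5811, the midpoint of 11.3347 and 11.8275, which is the value reported in med00.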

ballardw · Posted 07-31-2024 11:59 AM

Please share the log from running your PROC LIFETEST code, including the code and any notes or messages generated.

smackerz1988 · Posted 07-31-2024 12:10 PM

@ballardw Post updated!

Reeza · Posted 07-31-2024 12:20 PM

What is the data point just before the 50th percentile in the KM_Estimates table? Since the survival curve is a step function, I think the calculation uses a strict "less than" rather than a "less than or equal to" comparison, so it's likely the prior data point.
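
One way to pull the rows on either side of the 50% mark is sketched below. It assumes the KM_Estimates dataset and the variable names shown in the original post, that trt is a character variable with the value 'trt1', and an arbitrary window of 0.45 to 0.55 around the median.

```sas
/* List the product-limit rows near Survival = 0.5 for one stratum.   */
/* Rows where Survival is missing fall outside the range and are      */
/* excluded, because a missing value compares lower than 0.45.        */
proc print data=KM_Estimates noobs;
    where trt = 'trt1' and 0.45 <= Survival <= 0.55;
    var paramcd trt avalm Censor Survival Failed Left;
run;
```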

smackerz1988 · Posted 07-31-2024 12:52 PM

@Reeza I've updated the KM_Estimates table to show the range of values around the 50th percentile.

Ksharp · Posted 07-31-2024 10:42 PM

You'd be better off posting this in the Statistical Procedures forum:
https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures

In my opinion, since these are two different ways to calculate the median, you would not expect to get the same result.

smackerz1988 · Posted 08-01-2024 11:13 AM

Thanks @FreelanceReinh !
