Hello,
I'm encountering a discrepancy between the median survival times calculated using the quartiles method and the Kaplan-Meier product limit estimates in PROC LIFETEST. Here's a brief overview of my issue:
I am analysing survival data using PROC LIFETEST with the following SAS code:
proc lifetest data=adtte00 method=km conftype=LOGLOG plots=survival outsurv=survesti;
time avalM * CNSR(1);
strata trt;
by paramcd;
ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests=logrank00;
run;
When I check the output, I notice that the median survival time from the quartiles output (dataset med00) does not match the median calculated from the Kaplan-Meier product limit estimates (dataset KM_Estimates). Here’s an example of the output I’m seeing:
Quartiles Output:
data med00;
input Paramcd $ Stratum trt $ Percent Estimate Transform $ LowerLimit UpperLimit;
datalines;
TTD 1 trt1 50 11.5811 LOGLOG 10.4148 14.7844
TTD 2 trt2 50 4.5010 LOGLOG 3.5483 5.9138
;
run;
proc print data=med00;
run;
Product Limit Estimates:
data KM_Estimates;
input Paramcd $ Stratum trt $ avalm Censor Survival Failure StdErr Failed Left;
datalines;
TTD 1 trt1 11.2361 0 0.5112 0.4888 0.0375 86 92
TTD 1 trt1 11.3018 0 0.5056 0.4944 0.0375 88 90
TTD 1 trt1 11.3347 0 0.5000 0.5000 0.0375 89 89
TTD 1 trt1 11.8275 0 0.4944 0.5056 0.0375 90 88
TTD 1 trt1 12.4189 0 0.4888 0.5112 0.0375 91 87
;
run;
proc print data=KM_Estimates;
run;
Log:
25 GOPTIONS ACCESSIBLE;
26 ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests= logrank00;
27 proc lifetest data=adtte00 method= km conftype=LOGLOG plots=survival outsurv = survesti;
28 time avalM * CNSR(1);
29 strata trt;
30 by paramcd;
31 run;
NOTE: Graph's name, LIFETEST, changed to LIFETES5. LIFETEST is already used or not a valid SAS name.
NOTE: 19648 bytes written to /saswork/SAS_workA34500001BB1_dsprgn05.ds-grid.com/SAS_workD3CA00001BB1_dsprgn05.ds-grid.com/lifetest5.png.
NOTE: The above message was for the following BY group: Parameter Code=TTD
NOTE: The data set WORK.LOGRANK00 has 3 observations and 5 variables.
NOTE: The data set WORK.MED00 has 6 observations and 8 variables.
NOTE: The data set WORK.STAT00 has 3 observations and 8 variables.
NOTE: The data set WORK.KM_ESTIMATES has 271 observations and 10 variables.
NOTE: The data set WORK.SURVESTI has 235 observations and 8 variables.
NOTE: PROCEDURE LIFETEST used (Total process time): real time 0.90 seconds cpu time 0.84 seconds
Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates? Are there specific steps I can take to ensure consistent median calculations between these methods? Any insights or guidance on resolving this discrepancy would be greatly appreciated!
Hello @smackerz1988,
@smackerz1988 wrote:
Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates?
I don't see a discrepancy. Your KM_Estimates output shows that the 89th of the 178 survival times in stratum trt1 is 11.3347 and the 90th is 11.8275. Hence, all real numbers t in the closed interval [11.3347, 11.8275] qualify as "crude" (empirical) median survival times because the survival times of at least 50% (i.e., 89) of the subjects are ≤t and the survival times of at least 50% of the subjects are ≥t.
In this situation it is common to define the midpoint of that interval, (11.3347 + 11.8275)/2 = 11.5811, as the median (cf. the default QNTLDEF=5 option of PROC MEANS). This is analogous to what PROC LIFETEST does with the Kaplan-Meier estimates Ŝ(t) of the survival function: see the example for the first quartile in the section "Breslow, Fleming-Harrington, and Kaplan-Meier Methods" of the documentation, where it says: "If Ŝ(t) is exactly equal to 0.75 from t_j to t_{j+1}, the first quartile is taken to be (t_j + t_{j+1})/2." This explains why you got 11.5811 as the point estimate for the median in your quartiles output med00: Ŝ(t) is exactly equal to 0.5 from 11.3347 to 11.8275.
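If you want to reproduce that point estimate by hand, here is a minimal sketch of the rule quoted above, applied to the five rows you posted. The data set names km_demo and median_check, the helper variables t_j and eps, and the tolerance value are my own choices for illustration, not anything PROC LIFETEST uses internally. With the full KM_Estimates data set you would also need BY-group handling for paramcd and trt, which is left out here.

```sas
/* Five rows around the median, copied from the KM_Estimates listing in the question */
data km_demo;
  input avalm Survival;
  datalines;
11.2361 0.5112
11.3018 0.5056
11.3347 0.5000
11.8275 0.4944
12.4189 0.4888
;
run;

/* Median = first event time with S(t) < 0.5, unless S(t) sits exactly at 0.5
   over an interval, in which case the midpoint of that interval is taken */
data median_check;
  set km_demo end=last;
  retain t_j median;
  eps = 1e-10;  /* assumed tolerance for the "exactly 0.5" comparison */
  /* rows with a missing Survival value (e.g. censored times) are skipped */
  if not missing(Survival) and missing(median) then do;
    if not missing(t_j) then median = (t_j + avalm) / 2;     /* midpoint with the next event time */
    else if Survival < 0.5 - eps then median = avalm;        /* curve drops straight below 0.5 */
    else if abs(Survival - 0.5) <= eps then t_j = avalm;     /* curve sits at 0.5: remember t_j */
  end;
  if last then output;
  keep median;
run;

proc print data=median_check;
run;
```

For the posted rows this should give 11.5811, the same value as the Estimate in med00. The sketch leaves the median missing if the curve reaches 0.5 only at the last event time, so treat it as a check for this example and not as a general replacement for the Quartiles table.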
Please share the log from running your PROC LIFETEST code, including the code and any notes or messages generated.
@ballardw Post updated!
@Reeza I've updated the KM_Estimates table with the range of values around the 50th percentile.
Thanks @FreelanceReinh !