Hello,
I'm encountering a discrepancy between the median survival times calculated using the quartiles method and the Kaplan-Meier product limit estimates in PROC LIFETEST. Here's a brief overview of my issue:
I am analysing survival data using PROC LIFETEST with the following SAS code:
proc lifetest data=adtte00 method=km conftype=LOGLOG plots=survival outsurv=survesti;
time avalM * CNSR(1);
strata trt;
by paramcd;
ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests=logrank00;
run;
When I check the output, I notice that the median survival time from the quartiles output (dataset med00) does not match the median calculated from the Kaplan-Meier product limit estimates (dataset KM_Estimates). Specifically:
- Quartiles Output (med00): The median survival time is provided directly as the 50th percentile (11.5811).
- Product Limit Estimates (KM_Estimates): The first time point where the survival probability drops to 0.5 is used, leading to a slightly different median value (11.3347).
Here’s an example of the output I’m seeing:
Quartiles Output:
data med00;
input Paramcd $ Stratum trt $ Percent Estimate Transform $ LowerLimit UpperLimit;
datalines;
TTD 1 trt1 50 11.5811 LOGLOG 10.4148 14.7844
TTD 2 trt2 50 4.5010 LOGLOG 3.5483 5.9138
;
run;
proc print data=med00;
run;
Product Limit Estimates:
data KM_Estimates;
input Paramcd $ Stratum trt $ avalm Censor Survival Failure StdErr Failed Left;
datalines;
TTD 1 trt1 11.2361 0 0.5112 0.4888 0.0375 86 92
TTD 1 trt1 11.3018 0 0.5056 0.4944 0.0375 88 90
TTD 1 trt1 11.3347 0 0.5000 0.5000 0.0375 89 89
TTD 1 trt1 11.8275 0 0.4944 0.5056 0.0375 90 88
TTD 1 trt1 12.4189 0 0.4888 0.5112 0.0375 91 87
;
run;
proc print data=KM_Estimates;
run;
Log:
25 GOPTIONS ACCESSIBLE;
26 ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests= logrank00;
27 proc lifetest data=adtte00 method= km conftype=LOGLOG plots=survival outsurv = survesti;
28 time avalM * CNSR(1);
29 strata trt;
30 by paramcd;
31 run;
NOTE: Graph's name, LIFETEST, changed to LIFETES5. LIFETEST is already used or not a valid SAS name.
NOTE: 19648 bytes written to /saswork/SAS_workA34500001BB1_dsprgn05.ds-grid.com/SAS_workD3CA00001BB1_dsprgn05.ds-grid.com/lifetest5.png.
NOTE: The above message was for the following BY group: Parameter Code=TTD
NOTE: The data set WORK.LOGRANK00 has 3 observations and 5 variables.
NOTE: The data set WORK.MED00 has 6 observations and 8 variables.
NOTE: The data set WORK.STAT00 has 3 observations and 8 variables.
NOTE: The data set WORK.KM_ESTIMATES has 271 observations and 10 variables.
NOTE: The data set WORK.SURVESTI has 235 observations and 8 variables.
NOTE: PROCEDURE LIFETEST used (Total process time):
real time 0.90 seconds
cpu time 0.84 seconds
Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates? Are there specific steps I can take to ensure consistent median calculations between these methods? Any insights or guidance on resolving this discrepancy would be greatly appreciated!
Accepted Solutions
Hello @smackerz1988,
@smackerz1988 wrote:
Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates?
I don't see a discrepancy. Your KM_Estimates output shows that the 89th of the 178 survival times in stratum trt1 is 11.3347 and the 90th is 11.8275. Hence, all real numbers t in the closed interval [11.3347, 11.8275] qualify as "crude" (empirical) median survival times because the survival times of at least 50% (i.e., 89) of the subjects are ≤t and the survival times of at least 50% of the subjects are ≥t.
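Written as a formula, the condition above reads as follows. This is only a restatement; the symbols T_i (the individual survival times) and n (the number of subjects in the stratum) are notation introduced here, and it assumes, as the Failed and Left counts in the posted rows suggest, that no observation is censored before this point.

```latex
\#\{\, i : T_i \le t \,\} \;\ge\; \frac{n}{2}
\quad\text{and}\quad
\#\{\, i : T_i \ge t \,\} \;\ge\; \frac{n}{2},
\qquad n = 178,\quad \frac{n}{2} = 89 .
```

At t = 11.3347 the two counts are 89 and 90, at t = 11.8275 they are 90 and 89, and for any t strictly in between both are 89, so every point of the interval satisfies the condition.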
In this situation it is common to define the midpoint of that interval, (11.3347 + 11.8275)/2 = 11.5811, as the median (cf. the default QNTLDEF=5 option of PROC MEANS). This is analogous to what PROC LIFETEST does with the Kaplan-Meier estimates Ŝ(t) of the survival function: see the example for the first quartile in section Breslow, Fleming-Harrington, and Kaplan-Meier Methods of the documentation, where it says: "If Ŝ(t) is exactly equal to 0.75 from tj to tj+1, the first quartile is taken to be (tj + tj+1)/2." This explains why you got 11.5811 as the point estimate for the median in your quartiles output med00, given that Ŝ(t) is exactly equal to 0.5 from 11.3347 to 11.8275.
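To make the midpoint rule concrete, here is a minimal Python sketch that applies it to the five KM_Estimates rows posted in the question. It is not PROC LIFETEST's implementation: the function name km_quantile and the tolerance tol are illustrative assumptions, and the "first time with Ŝ(t) below the target" branch reflects my reading of the documentation rather than anything shown in this thread.

```python
def km_quantile(times, survival, p=0.5, tol=1e-9):
    """Quantile of a Kaplan-Meier curve with the midpoint rule (illustrative sketch).

    times    -- event times in increasing order
    survival -- Kaplan-Meier estimate S(t) at each of those times
    p        -- quantile to estimate (0.5 for the median)
    tol      -- hypothetical tolerance for deciding that S(t) equals 1 - p
    """
    target = 1.0 - p
    for j, s in enumerate(survival):
        if abs(s - target) <= tol:
            # S(t) stays on the target from times[j] until the next event time,
            # so the midpoint of that plateau is reported.
            if j + 1 < len(times):
                return (times[j] + times[j + 1]) / 2
            return None  # plateau runs to the end of the data; not handled here
        if s < target:
            # S(t) jumped past the target without sitting on it.
            return times[j]
    return None  # the curve never reaches the target


# Rows for stratum trt1 as posted in the question (avalm, Survival).
times = [11.2361, 11.3018, 11.3347, 11.8275, 12.4189]
survival = [0.5112, 0.5056, 0.5000, 0.4944, 0.4888]

median = km_quantile(times, survival)
print(round(median, 4))  # 11.5811
```

With the posted values rounded to four decimals, 0.5000 matches the target exactly; with full-precision estimates the choice of tolerance would matter, which is why it is exposed as a parameter.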
Please share the LOG from running your PROC LIFETEST code, including the code and any notes or messages generated.
@ballardw Post updated!
@Reeza I've updated the KM_Estimates table to show the range of values around the 50th percentile.
https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures
My opinion is that, since these are two different ways to calculate the median, you would not expect to get the same result.
Thanks @FreelanceReinh !