SAS Programming

smackerz1988
Pyrite | Level 9

Hello,

 

I'm encountering a discrepancy between the median survival times calculated using the quartiles method and the Kaplan-Meier product limit estimates in PROC LIFETEST. Here's a brief overview of my issue:

 

I am analysing survival data using PROC LIFETEST with the following SAS code:

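proc lifetest data=adtte00 method=km conftype=LOGLOG plots=survival outsurv=survesti;
   time avalM * CNSR(1);   /* CNSR = 1 marks censored observations */
   strata trt;             /* one survival curve per treatment arm */
   by paramcd;             /* separate analysis per parameter code */
   /* capture the ODS tables as data sets */
   ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests=logrank00;
run;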

When I check the output, I notice that the median survival time from the quartiles output (dataset med00) does not match the median I read off the Kaplan-Meier product limit estimates (dataset KM_Estimates). Specifically:

  • Quartiles output (med00): the median survival time is reported directly as the 50th percentile (11.5811).
  • Product limit estimates (KM_Estimates): I take the time point at which the survival probability drops to 0.5, which gives a slightly different median (11.3347).

Here’s an example of the output I’m seeing:

Quartiles Output:

data med00;
    input Paramcd $ Stratum trt $ Percent Estimate Transform $ LowerLimit UpperLimit;
    datalines;
TTD 1 trt1 50 11.5811 LOGLOG 10.4148 14.7844
TTD 2 trt2 50 4.5010 LOGLOG 3.5483 5.9138
;
run;

proc print data=med00;
run;

Product Limit Estimates:

data KM_Estimates;
    input Paramcd $ Stratum trt $ avalm Censor Survival Failure StdErr Failed Left;
    datalines;
TTD 1 trt1 11.2361 0 0.5112 0.4888 0.0375 86 92
TTD 1 trt1 11.3018 0 0.5056 0.4944 0.0375 88 90
TTD 1 trt1 11.3347 0 0.5000 0.5000 0.0375 89 89
TTD 1 trt1 11.8275 0 0.4944 0.5056 0.0375 90 88
TTD 1 trt1 12.4189 0 0.4888 0.5112 0.0375 91 87
;
run;

proc print data=KM_Estimates;
run;

Log:

25         GOPTIONS ACCESSIBLE;
26         ods output ProductLimitEstimates=KM_Estimates CensoredSummary=stat00 quartiles=med00 homtests= logrank00;
27         proc lifetest data=adtte00 method= km conftype=LOGLOG plots=survival outsurv = survesti;
28            time avalM * CNSR(1);
29            strata trt;
30            by paramcd;
31         run;

NOTE: Graph's name, LIFETEST, changed to LIFETES5. LIFETEST is already used or not a valid SAS name.
NOTE: 19648 bytes written to 
      /saswork/SAS_workA34500001BB1_dsprgn05.ds-grid.com/SAS_workD3CA00001BB1_dsprgn05.ds-grid.com/lifetest5.png.
NOTE: The above message was for the following BY group:
      Parameter Code=TTD
NOTE: The data set WORK.LOGRANK00 has 3 observations and 5 variables.
NOTE: The data set WORK.MED00 has 6 observations and 8 variables.
NOTE: The data set WORK.STAT00 has 3 observations and 8 variables.
NOTE: The data set WORK.KM_ESTIMATES has 271 observations and 10 variables.
NOTE: The data set WORK.SURVESTI has 235 observations and 8 variables.
NOTE: PROCEDURE LIFETEST used (Total process time):
      real time           0.90 seconds
      cpu time            0.84 seconds

 

Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates? Are there specific steps I can take to ensure consistent median calculations between these methods? Any insights or guidance on resolving this discrepancy would be greatly appreciated!

 

 

 

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @smackerz1988,

 


@smackerz1988 wrote:

Why is there a discrepancy between the median survival times calculated using the quartiles method and the product limit estimates?


I don't see a discrepancy. Your KM_Estimates output shows that the 89th of the 178 survival times in stratum trt1 is 11.3347 and the 90th is 11.8275. Hence, all real numbers t in the closed interval [11.3347, 11.8275] qualify as "crude" (empirical) median survival times because the survival times of at least 50% (i.e., 89) of the subjects are ≤t and the survival times of at least 50% of the subjects are ≥t.

 

In this situation it is common to define the midpoint of that interval, (11.3347 + 11.8275)/2 = 11.5811, as the median (cf. the default QNTLDEF=5 option of PROC MEANS). This is analogous to what PROC LIFETEST does with the Kaplan-Meier estimates Ŝ(t) of the survival function: see the example for the first quartile in the section "Breslow, Fleming-Harrington, and Kaplan-Meier Methods" of the documentation, where it says: "If Ŝ(t) is exactly equal to 0.75 from tⱼ to tⱼ₊₁, the first quartile is taken to be (tⱼ + tⱼ₊₁)/2." This explains why you got 11.5811 as the point estimate for the median in your quartiles output med00: Ŝ(t) is exactly equal to 0.5 from 11.3347 to 11.8275.
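
To make the rule concrete, here is a minimal sketch that recomputes the median from the five KM_Estimates rows posted in the question (variables trt, avalm and Survival as in that DATA step). The data set name KM_Median, the helper variables t_flat and found, and the tolerance fuzz are made up for illustration and are not part of the LIFETEST output. As far as I know, the full ProductLimitEstimates table also contains rows with a missing Survival value (censored or tied observations), which the WHERE clause skips. With several BY groups or strata you would extend the WHERE clause accordingly.

```sas
/* Sketch: rebuild the median for stratum trt1 from the KM rows.     */
/* KM_Median, t_flat, found and fuzz are illustrative names only.    */
data KM_Median(keep=Paramcd trt median);
   set KM_Estimates end=last;
   where trt = 'trt1' and Survival is not missing;
   retain t_flat median;      /* both start as missing */
   retain found 0;
   fuzz = 1e-8;               /* tolerance for "exactly 0.5" */

   if not found then do;
      /* remember the time at which S(t) lands on 0.5 */
      if abs(Survival - 0.5) <= fuzz then t_flat = avalm;
      /* first event time with S(t) below 0.5 */
      else if Survival < 0.5 - fuzz then do;
         if missing(t_flat) then median = avalm;   /* no flat stretch at 0.5 */
         else median = (t_flat + avalm) / 2;       /* midpoint rule */
         found = 1;
      end;
   end;

   if last then output;       /* one row holding the result */
run;

proc print data=KM_Median;    /* median = 11.5811 for the posted rows */
run;
```

For the posted rows, t_flat is set to 11.3347 (Survival = 0.5000) and the next row (11.8275, Survival = 0.4944) triggers the midpoint, so the sketch returns the same 11.5811 that appears in med00. If no row had a survival estimate of exactly 0.5, it would fall back to the first event time with Ŝ(t) below 0.5.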


ballardw
Super User

Please share the LOG from running your PROC LIFETEST code, including the code and any notes or messages generated.

Reeza
Super User
What is the data point just prior to the 50th percentile in the KM_Estimates table? Since the survival curve is a step function, I think the calculation uses a strict "less than" comparison rather than "less than or equal to", so it's likely the prior data point.

smackerz1988
Pyrite | Level 9

@Reeza I've updated the KM_Estimates table to show the range of values around the 50th percentile.

Ksharp
Super User
You'd be better off posting this in the Statistical Procedures forum:
https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures

My opinion is that, since these are two different ways to calculate the median, you would not expect to get the same result.
