leesw0212
Calcite | Level 5

leesw0212_1-1598213381248.png

According to the SAS User's Guide, the inverse probability weight is built from the Kaplan-Meier estimator of the censoring distribution evaluated at the time point just before X_i, that is, G(X_i-)^(-2).

 

leesw0212_2-1598213410051.png

However, Uno's paper, "On the C-Statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data", evaluates the Kaplan-Meier estimator at the exact time point, G(X_i)^(-2). Depending on which time point I use, I get a slightly different value of the C-index.

I am not sure why SAS evaluates the Kaplan-Meier estimator just before X_i, G(X_i-)^(-2), when the paper uses the exact time point X_i, G(X_i)^(-2), for the inverse probability weighting.

Should I use the exact time point or the time point just before X_i?
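
To see how much the choice matters on a given data set, both weights can be computed side by side. The following is a minimal Python sketch, not SAS's implementation: the function names `km_censoring` and `uno_c`, the toy data, and the handling of ties (the censoring Kaplan-Meier is obtained by simply flipping the event indicator, and tied risk scores count as not concordant) are all assumptions made for illustration.

```python
def km_censoring(times, events, t, left_limit=False):
    """Kaplan-Meier estimate of the censoring survival function G.

    Returns G(t), or the left limit G(t-) when left_limit=True.
    Assumption: censorings (events == 0) are treated as the "events"
    of a standard KM fit; the risk set at s is everyone with X >= s.
    """
    g = 1.0
    censor_times = sorted({x for x, d in zip(times, events) if d == 0})
    for s in censor_times:
        # G(t) includes the jump at s == t; G(t-) stops just before it.
        if s > t or (left_limit and s == t):
            break
        at_risk = sum(1 for x in times if x >= s)
        censored = sum(1 for x, d in zip(times, events) if x == s and d == 0)
        g *= 1.0 - censored / at_risk
    return g


def uno_c(times, events, scores, tau, left_limit):
    """IPCW concordance truncated at tau, with weight G(X_i)^-2 or G(X_i-)^-2."""
    num = den = 0.0
    n = len(times)
    for i in range(n):
        # Only subjects with an observed event before tau anchor a pair.
        if events[i] != 1 or times[i] >= tau:
            continue
        w = km_censoring(times, events, times[i], left_limit) ** -2
        for j in range(n):
            if times[i] < times[j]:
                den += w
                if scores[i] > scores[j]:  # higher score = higher risk
                    num += w
    return num / den


# Hypothetical toy data: one event and one censoring are tied at time 2.
times = [1, 2, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.8, 0.3, 0.5, 0.1]

c_exact = uno_c(times, events, scores, tau=6, left_limit=False)
c_left = uno_c(times, events, scores, tau=6, left_limit=True)
print(f"G(X_i):  {c_exact:.3f}")
print(f"G(X_i-): {c_left:.3f}")
```

On this toy data the two versions give about 0.851 and 0.886. The weights differ only for the subject whose event time (2) coincides with a censoring time, because the censoring Kaplan-Meier curve jumps only at censoring times. Without such ties, G(X_i) and G(X_i-) agree at every event time and the two C-indexes are identical in this sketch.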

Rick_SAS
SAS Super FREQ

I haven't read the paper, but I can interpret the formula that you posted. If tau is the time of interest, the formula says to use only the time points that strictly precede tau. The indicator variable I(X_i < tau) is zero for any observation whose time X_i is equal to or greater than tau.
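
For readers who cannot see the posted image, here is a transcription of the truncated C-statistic as it appears in Uno et al. (2011), to the best of my recollection. The weight is written generically as w_i because that is the only piece in dispute in this thread; the symbols Delta_i (event indicator), Z_i (covariates), and beta-hat (estimated coefficients) follow the paper's notation and are not defined elsewhere in this thread.

```latex
\hat{C}_\tau =
\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} \Delta_i \, w_i \,
      I(X_i < X_j,\; X_i < \tau)\,
      I(\hat\beta^{\top} Z_i > \hat\beta^{\top} Z_j)}
     {\sum_{i=1}^{n}\sum_{j=1}^{n} \Delta_i \, w_i \,
      I(X_i < X_j,\; X_i < \tau)},
\qquad
w_i =
\begin{cases}
\{\hat G(X_i)\}^{-2}  & \text{(paper, as quoted in this thread)} \\
\{\hat G(X_i-)\}^{-2} & \text{(SAS User's Guide, as quoted in this thread)}
\end{cases}
```

The factor I(X_i < tau) is what removes every pair whose earlier time is at or beyond tau, in both the numerator and the denominator.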

leesw0212
Calcite | Level 5

Thank you so much for your reply!

I understand that tau is the upper limit of the time interval.

However, my question is why SAS evaluates the Kaplan-Meier estimator just before X_i, G(X_i-)^(-2), whereas the paper evaluates it at the exact time point X_i, G(X_i)^(-2), for the inverse probability weighting.

 

SteveDenham
Jade | Level 19

I think the most probable reason is that if you specified the exact time, the KM estimator would include any values observed up to, but not including, the next time point. KM estimators are based on intervals that are "open ended" at the upper end, so I suspect the paper you are citing either used a "closed end" cutoff (unlikely) or indexed the interval by its endpoint rather than its beginning point. That is probably OK, unless you have just a single interval.

 

SteveDenham
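
The difference between G(t) and G(t-) can be made concrete with a small example. Below is a sketch in Python, assuming the usual right-continuous step-function convention for the Kaplan-Meier estimator; the helper names `km_censoring_steps` and `G` and the five toy observations are hypothetical.

```python
from bisect import bisect_left, bisect_right


def km_censoring_steps(times, events):
    """Return (jump_times, values) for the censoring KM curve.

    The curve equals values[k] on [jump_times[k], jump_times[k+1]),
    i.e. each interval is closed on the left and open on the right.
    """
    jump_times, values = [], []
    g = 1.0
    for s in sorted({x for x, d in zip(times, events) if d == 0}):
        at_risk = sum(x >= s for x in times)
        censored = sum(x == s and d == 0 for x, d in zip(times, events))
        g *= 1.0 - censored / at_risk
        jump_times.append(s)
        values.append(g)
    return jump_times, values


def G(t, jump_times, values, left_limit=False):
    """Evaluate the step function at t, or just before t if left_limit."""
    # bisect_right counts jumps at times <= t (value at t);
    # bisect_left counts jumps at times < t (left limit at t).
    k = (bisect_left if left_limit else bisect_right)(jump_times, t)
    return 1.0 if k == 0 else values[k - 1]


# Toy data: censorings (events == 0) at times 2 and 4.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1]
jump_times, values = km_censoring_steps(times, events)

for t in times:
    print(t, G(t, jump_times, values), G(t, jump_times, values, left_limit=True))
```

At the censoring time 2 the curve has already dropped, so G(2) = 0.75 while G(2-) = 1.0. At time 3, where no censoring occurs, both evaluations return 0.75.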

leesw0212
Calcite | Level 5

Thank you so much for your reply!

That's very helpful 😄
