SAS_User
Calcite | Level 5

I would like to understand what it means (in the clinical sense and from a statistics perspective) when you run a K-M analysis by stratifying on the censoring variable. E.g.

proc lifetest;
  time PFS*censor(1);
  strata censor;
run;

I am asking this question because the stratifying variable in my analysis is highly correlated with the censor variable. I would like to gather enough talking points to discourage the approach. It would be great to hear your thoughts and opinions. 

 

Thank you!

3 REPLIES
Reeza
Super User

It depends on what you're analyzing or trying to measure.

If you're looking at survival times, that's exactly why you use censoring/survival analysis. 

 

Not including the censored records shows a very different pattern, which is why it's dangerous to only use 'complete' information. 

To get a genuine picture of the situation, you also need to account for the fact that follow-up is incomplete, or whatever other reason there is for censoring. There's nothing wrong with 'looking' at it this way, but it's not how you analyze the information to understand your survival times. It also matters how much censoring you have: in my experience, if more than 25% of the records are censored, the results start to become questionable. If the censored observations show a different pattern, that's even more concerning, because it means there's some difference in survival, and why these records are being censored would be of interest. It usually means there's something wrong with the treatment or service. If 30% can't complete the treatment protocol and you analyze only the people who did complete it, the results can be very misleading.

 

 

JacobSimonsen
Barite | Level 11

From a mathematical point of view, it doesn't make sense to stratify on the censor variable, because doing so conditions on the future. In the stratum of uncensored subjects, the Kaplan-Meier curve shows the probability that an event happens before time t, conditional on the event happening at some future time before the person is censored. This will surely lead to overestimation of the probability that an event happens.
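A minimal simulation sketch can make this concrete. The dataset name sim, the seed, the sample size, and the exponential event and censoring times with a mean of 12 months are all assumptions for illustration, not taken from the original analysis; pfs and censor mirror the variable names in the question.

```sas
/* Hypothetical simulated data: independent event and censoring times. */
data sim;
  call streaminit(2024);                      /* arbitrary seed for reproducibility */
  do id = 1 to 500;
    event_time  = 12 * rand('exponential');   /* assumed mean of 12 months */
    censor_time = 12 * rand('exponential');   /* assumed mean of 12 months */
    pfs    = min(event_time, censor_time);    /* observed follow-up time */
    censor = (censor_time < event_time);      /* 1 = censored, 0 = event */
    output;
  end;
run;

/* Usual analysis: censored records contribute partial information. */
proc lifetest data=sim plots=survival;
  time pfs*censor(1);
run;

/* Stratifying on the censoring indicator, as in the question. */
proc lifetest data=sim plots=survival;
  time pfs*censor(1);
  strata censor;
run;
```

In the stratified run, the censor=0 stratum contains only events, so its curve falls all the way to zero and should sit below the unstratified estimate, while the censor=1 stratum contains no events at all and its estimate stays flat at 1. Neither stratum describes the survival experience of the whole population.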

SAS_User
Calcite | Level 5

Thanks to both of you for the response and explanation
