I am trying to figure out how to compute predicted probabilities (cumulative incidence) from a competing risks Cox model fit with PROC PHREG. I know I can use the BASELINE statement and obtain them with the CIF= keyword, but I want to be able to do this outside of that procedure (in a different program or an application like Excel).
To do this in a cause-specific model, one can use the BASELINE statement to obtain baseline survival estimates (S0) by setting all covariates to 0 and/or their reference levels, and then use them to compute the survival probability at a specific point in time: S(t) = S0(t)^exp(X*beta).
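As a point of reference, the cause-specific step described above could be sketched outside SAS roughly as follows. This is only an illustration in Python: the function name `cox_survival` and the numeric values are hypothetical, and it assumes the baseline survival S0(t) was exported at covariates equal to 0 (or reference levels), coded the same way as in the fitted model.

```python
import math

def cox_survival(s0_t, betas, x):
    """Predicted survival at time t: S(t | x) = S0(t) ** exp(x'beta).

    s0_t  : baseline survival at time t (all covariates at 0 / reference)
    betas : regression coefficients from the fitted model
    x     : covariate values for the subject, in the same order as betas
    """
    linear_predictor = sum(b * v for b, v in zip(betas, x))
    return s0_t ** math.exp(linear_predictor)

# Hypothetical numbers, for illustration only (not from any real model).
s0_at_t = 0.90
betas = [0.5, -0.2]
subject = [1.0, 2.0]
print(cox_survival(s0_at_t, betas, subject))
```

The same expression can be typed directly into a spreadsheet cell; the only inputs are S0(t), the betas, and the subject's covariate values.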
What I would like to know is whether there is an analogous process for competing risks.
I have tried the following with some degree of success:
Set all covariates to 0 and/or their reference levels to get a baseline cumulative incidence estimate using the CIF= keyword. Then use that baseline estimate to derive the predicted cumulative incidence as follows:
F(t) = 1 - exp(-exp(X*beta)*CIF0(t)), where CIF0(t) is the baseline cumulative incidence estimate at time t.
I have tested this with real data and I come close to the predicted probability SAS produces. However, my computed probability is never exactly the same as the one from SAS. It is always low by less than 0.01. This could be rounding error on my part, but I am not sure. I was hoping someone in the community could confirm that what I am doing is accurate and/or direct me towards the correct way to calculate these probabilities.
Thank you for reading this. Your help is appreciated!
Hello,
Yes, it is possible to get the cumulative incidence functions in a competing risks model. The Fine & Gray method gives you what you want, and it is implemented in the most recent release. It is quite easy. Your status variable should also indicate which type of event occurred, and the EVENTCODE= option in the MODEL statement is used to specify the type of event of interest.
The MODEL statement should look something like this, assuming that event type 1 is the event of interest:
model T*Status(0)= X1-X5 / eventcode=1;
The status variable here can take values other than 0 (censoring) and 1 (event of interest).
And with the BASELINE statement you get the probability function out in a data set with the CIF= keyword.
Thanks, Jacob, for your response. Yes, I am doing exactly what you describe. However, I do not want to have to rely on the BASELINE statement to produce the predicted probabilities. I should have stated my question more clearly: I want to take the beta estimates that I get from the competing risks model and obtain predicted probabilities manually, without using the BASELINE statement in PHREG.
Using the formula I described in my original post, I have been close but not exact. I believe I figured it out last night, though. After obtaining CIF0(t), the baseline probability at time point t obtained by setting all covariates to zero and/or their reference levels, one needs to get the baseline cumulative subdistribution hazard Λ10(t). This can be done by setting Λ10(t) = -ln(1 - CIF0(t)). That cumulative subdistribution hazard is what is needed, and the formula for prediction is: F(t) = 1 - exp(-exp(X*beta)*Λ10(t)). I applied this formula in Excel and was able to match exactly the probabilities SAS was providing via the BASELINE statement. The paper by Ying So et al., "Using the PHREG Procedure to Analyze Competing-Risks Data," was very helpful in this regard.
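For anyone who wants to reproduce this outside Excel, here is a minimal sketch of the calculation in Python. The function names and the numbers are hypothetical and only for illustration; it assumes CIF0(t) was obtained with all covariates at 0 and/or reference levels, and that the subject's covariates are coded the same way as in the model. The second function is the formula from the original post, included only to show why it came out slightly low: for 0 < CIF0(t) < 1, -ln(1 - CIF0(t)) is always larger than CIF0(t).

```python
import math

def predicted_cif(cif0_t, betas, x):
    """F(t | x) = 1 - exp(-exp(x'beta) * L10(t)), with L10(t) = -ln(1 - CIF0(t))."""
    linear_predictor = sum(b * v for b, v in zip(betas, x))
    # Baseline cumulative subdistribution hazard recovered from the baseline CIF.
    cum_subhazard0 = -math.log(1.0 - cif0_t)
    return 1.0 - math.exp(-math.exp(linear_predictor) * cum_subhazard0)

def first_attempt(cif0_t, betas, x):
    """Formula from the original post: plugs CIF0(t) in where L10(t) belongs."""
    linear_predictor = sum(b * v for b, v in zip(betas, x))
    return 1.0 - math.exp(-math.exp(linear_predictor) * cif0_t)

# Hypothetical inputs, for illustration only (not from any real model).
cif0_at_t = 0.10
betas = [0.3]
subject = [1.0]

print("subdistribution-hazard formula:", predicted_cif(cif0_at_t, betas, subject))
print("original-post formula:         ", first_attempt(cif0_at_t, betas, subject))
```

Algebraically the first function is the same as 1 - (1 - CIF0(t))^exp(X*beta), which may be the more convenient form for a single spreadsheet cell.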