Solved: Re: Confidence intervals for survival probability estimates with proc ...

Haemoglobin17 · Posted 07-07-2023 10:00 AM

Dear all,

I am conducting a cohort study and would like to write the survival probability estimates, together with their confidence intervals, to a dataset that I can export in order to create a failure curve in R.

With the following PROC PHREG code I can output a dataset containing the survival probability estimates, but I can't do the same for the confidence intervals. Could you help me, please? I didn't find a specific solution on the internet. Thank you very much!

proc phreg data=Dataset;
class group (ref="0");
model followup*event(0)=
/ ties=efron rl;
weight sw/ normalize;
strata group;
output out=Dataset_Surv survival=s;
run;

FreelanceReinh · Posted 07-07-2023 11:09 AM

Hello @Haemoglobin17,

I think you can use the BASELINE statement instead of the OUTPUT statement:

baseline out=want survival=s l=lcl u=ucl;

Dataset WANT will contain the survival probability estimates (variable S) together with the confidence limits (LCL, UCL).
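
For context, here is a minimal sketch of how the BASELINE statement might slot into the step from the question. It assumes the dataset and variable names used there (Dataset, followup, event, group, sw) and leaves the model otherwise unchanged; LOWER= and UPPER= are the long forms of L= and U=.

```sas
proc phreg data=Dataset;
   class group (ref="0");
   model followup*event(0)= / ties=efron rl;
   weight sw / normalize;
   strata group;
   /* WANT gets one row per distinct survival estimate in each stratum: */
   /* S = survival estimate, LCL/UCL = lower/upper confidence limits    */
   baseline out=want survival=s lower=lcl upper=ucl;
run;
```

The confidence limits use the procedure's default settings here. If a different confidence level or transformation is needed, the BASELINE options after a slash (such as ALPHA= and CLTYPE=) are the place to look in the documentation.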

Haemoglobin17 · Posted 07-07-2023 11:56 AM

This is exactly what I was looking for! I didn't know about the BASELINE statement, thank you @FreelanceReinh!

Haemoglobin17 · Posted 07-07-2023 12:13 PM

I noticed that with the BASELINE statement the output dataset has 3,900 observations, whereas with the OUTPUT statement it has 4 million. What is the reason? The log shows no errors or warnings. When I re-run the program with OUTPUT, it goes back to 4 million. Is there a way to fix this?

FreelanceReinh · Posted 07-07-2023 01:11 PM

@Haemoglobin17 wrote:

I noticed that with the BASELINE statement the output dataset has 3,900 observations, whereas with the OUTPUT statement it has 4 million. What is the reason? The log shows no errors or warnings. When I re-run the program with OUTPUT, it goes back to 4 million. Is there a way to fix this?

The output dataset from the BASELINE statement is "condensed" in that it contains each survival probability estimate (variable S) only once (per group). If you have tied event times (i.e. duplicate values of variable FOLLOWUP within a group) or censored observations (EVENT=0), where S doesn't change, the dataset created by the OUTPUT statement will contain the corresponding observations, all with the same S value. This redundancy is avoided in the BASELINE output dataset.

If you sort the 4-million-observation output dataset NODUPKEY by GROUP S, the resulting dataset should have very close to 3900 observations, the remaining discrepancies, if any, being "trivial" observations with FOLLOWUP=0 & S=1. No non-trivial values of S should be lost (i.e., be unavailable from the BASELINE statement).
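
The check described above could be sketched as follows, assuming the OUTPUT dataset is still called Dataset_Surv and contains GROUP and S. The name surv_unique is only a placeholder for the deduplicated copy.

```sas
/* Keep one observation per distinct (GROUP, S) combination.        */
/* The log reports how many duplicate-key observations were removed. */
proc sort data=Dataset_Surv out=surv_unique nodupkey;
   by group s;
run;
```

The observation count of surv_unique in the log can then be compared with the count of the BASELINE output dataset.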

You could merge the LCL and UCL values from the BASELINE output dataset to the large output dataset in a DATA step if you need them redundantly "multiplied" as well.
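
One possible way to do such a merge is sketched below. It is not the only approach, and it rests on assumptions: the OUTPUT dataset is Dataset_Surv, the BASELINE dataset is WANT with variables GROUP, FOLLOWUP, LCL and UCL, and Dataset_Surv does not already contain variables named LCL or UCL. Because WANT only holds the times at which S changes, the limits are carried forward to later censored times within each group. The names big, limits, big_with_cl, _lcl and _ucl are placeholders.

```sas
proc sort data=Dataset_Surv out=big;
   by group followup;
run;

proc sort data=want out=limits(keep=group followup lcl ucl);
   by group followup;
run;

data big_with_cl;
   merge big(in=inbig)
         limits(in=inlim rename=(lcl=_lcl ucl=_ucl));
   by group followup;
   retain lcl ucl;                     /* carry limits forward between event times */
   if first.group then call missing(lcl, ucl);
   if inlim then do;                   /* a new estimate starts at this time       */
      lcl = _lcl;
      ucl = _ucl;
   end;
   if inbig;                           /* keep only rows of the large dataset      */
   drop _lcl _ucl;
run;
```

Observations before the first event time in a group receive whatever limits WANT reports at time 0, which may be missing.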

Haemoglobin17 · Posted 07-07-2023 01:26 PM

If I keep the repeated values of the estimates, does this change the survival/failure curve?

FreelanceReinh · Posted 07-07-2023 02:03 PM

If the S values for two different FOLLOWUP times (in the same group) are equal, then at least one of the observations should be a censored observation. These points correspond to a constant ("flat") part of the estimated survival curve. Duplicate S values for two or more observations with the same FOLLOWUP time correspond to only one point of the survival curve.

To draw the survival curve, you don't need the redundant duplicate S values. (You need the censored times for the markers indicating censored observations, though.) Between the points defined by the event times and the corresponding unique S values the curve is just flat.
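
Since the goal is a failure curve, the survival estimates can be converted with failure = 1 - S, which swaps the roles of the confidence limits (the lower failure limit is 1 - UCL, the upper is 1 - LCL). A minimal sketch in SAS, assuming the BASELINE dataset WANT with variables FOLLOWUP, GROUP, S, LCL and UCL; the dataset name failure and the variables f, f_lower and f_upper are placeholders:

```sas
data failure;
   set want;
   f       = 1 - s;      /* failure probability                         */
   f_lower = 1 - ucl;    /* upper survival limit gives the lower bound  */
   f_upper = 1 - lcl;    /* lower survival limit gives the upper bound  */
run;

proc sgplot data=failure;
   /* Step-shaped band and curve: the estimate is flat between event times */
   band x=followup lower=f_lower upper=f_upper /
        group=group type=step transparency=0.7;
   step x=followup y=f / group=group;
   yaxis label="Failure probability";
run;
```

The same transformation applies if the plot is drawn in R after exporting WANT; only the unique S values per group are needed for the step curve.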
