☑ This topic is solved.
kc
Quartz | Level 8

I am simulating survival data to mimic an existing dataset. The details of the original dataset are as follows:

1. Two treatment groups.

2. Patients were followed up for at least 5 years.

3. The event rate at 5 years is around 50% in both groups.

4. An additional 10% (trt 1) and 25% (trt 2) of patients dropped out of the study before completing the full 5-year follow-up.

I am using the Weibull Shape (1.0147) and Weibull Scale (6.4465) parameters from the output of PROC LIFEREG, run separately by group, to simulate the data. The survival time (in years) was capped at 5 years before running the procedure.

The PROC LIFEREG code is as follows:

proc lifereg data=surv;
   where group=1;   /* fit one treatment group at a time */
   /* event=0 marks right-censored observations */
   model surv_time_years*event(0) = / dist=Weibull;
run;

I am using the following line of code to simulate data (with the same number of patients) for trt 1: Time=rand('Weibull', 1.0147, 6.4465)

Although the overall mean time for trt 1 is almost the same in the simulated data (3.46) and the original data (3.45), the issue is that the simulation overestimates the number of patients completing 5 years. Around 10% more patients in the simulated dataset have time >= 5y, with some really extreme values not seen in the original dataset. As a consequence, the numbers of patients completing 1, 2, 3, 4 and 5 years are completely different in the simulated and original data.
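To see how many simulated patients should be expected to reach 5 years when only the event time is drawn, the Weibull survival function S(t) = exp(-(t/b)**a) can be evaluated at t = 5 with the shape and scale quoted above. This is only a sketch; the variable names are illustrative, and it assumes the two values are the shape and scale in the sense used by the RAND and CDF functions.

```sas
/* Share of event times beyond 5 years under Weibull(shape a, scale b) */
data _null_;
   a = 1.0147;                                /* Weibull Shape quoted in the question */
   b = 6.4465;                                /* Weibull Scale quoted in the question */
   s5       = exp(-((5/b)**a));               /* closed-form survival function at t = 5 */
   s5_check = 1 - cdf('Weibull', 5, a, b);    /* same quantity via the CDF function */
   put s5= percent8.1 s5_check= percent8.1;   /* written to the SAS log */
run;
```

With these parameters S(5) is roughly 46%, so a simulation that draws only event times has no mechanism for removing the patients who dropped out before 5 years.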

 

I followed other posts on the forum and, as suggested in one of them, tried removing the censoring specification 'event(0)' from the MODEL statement when estimating the Weibull parameters, but with no luck. I also have Rick Wicklin's 'Simulating Data with SAS' and have gone through the relevant sections on simulating survival data.

I also tried simulating the data from other distributions (comparing model fit using AIC/BIC from PROC LIFEREG), with similar or even worse results.

I understand this is a simulation and some variation is expected, but I feel I am missing something here.

Any help is appreciated. 

1 ACCEPTED SOLUTION
FreelanceReinh
Jade | Level 19

Hello @kc,

How did you implement item 4 (the drop-outs) in your simulation? In the PharmaSUG 2004 paper "Statistical Simulations for Sample Size Calculation with PROC IML", the author considers the distributions of three times:

  • from study start to enrollment
  • from enrollment to the event of interest
  • from enrollment to drop-out.
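One way to act on this is to draw a drop-out time alongside the event time and keep whichever comes first, capped at 5 years. The DATA step below is a minimal sketch under several assumptions that are not in the thread: an exponential drop-out time, a hypothetical hazard of 0.03 per year (picked so that roughly 10% of patients drop out before 5 years with these Weibull parameters), a hypothetical sample size of 500, and no staggered enrollment, so every patient has the full 5 years of potential follow-up. It is not the method from the paper.

```sas
data sim_trt1;
   call streaminit(20240721);                    /* arbitrary seed for reproducibility */
   a = 1.0147;  b = 6.4465;                      /* Weibull shape and scale from the question */
   drop_rate = 0.03;                             /* hypothetical drop-out hazard per year */
   do id = 1 to 500;                             /* hypothetical number of patients */
      t_event = rand('Weibull', a, b);           /* latent time to event */
      t_drop  = rand('Exponential') / drop_rate; /* latent time to drop-out, mean 1/drop_rate */
      time    = min(t_event, t_drop, 5);         /* observed time: first of event, drop-out, 5 years */
      event   = (t_event <= min(t_drop, 5));     /* 1 = event observed, 0 = censored */
      dropout = (t_drop < min(t_event, 5));      /* 1 = censored by drop-out before 5 years */
      output;
   end;
   keep id time event dropout;
run;

/* Shares of events, drop-outs and 5-year completers (event=0 and dropout=0) */
proc freq data=sim_trt1;
   tables event*dropout / norow nocol;
run;
```

The drop-out hazard would need tuning per group, for example a larger value for trt 2, until the simulated drop-out share matches the original data.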


6 REPLIES
sbxkoenk
SAS Super FREQ

Hello,

Maybe @Rick_SAS can help?

It's the Belgian national holiday over here, so I'm not allowed to rack my brain 😁.

Cheers,

Koen

kc
Quartz | Level 8
No worries. I will look at the suggestions from @Rick_SAS and @FreelanceReinhard.
kc
Quartz | Level 8
Thank you. Will take a closer look at the paper.
Rick_SAS
SAS Super FREQ

PROC LIFEREG uses a different parameterization of the Weibull distribution than the RAND function and PROC UNIVARIATE. You can read about the difference, and how to convert one set of parameters into the other, in the article "Interpret estimates for a Weibull regression model in SAS".
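For an intercept-only Weibull model, the conversion is: RAND shape = 1/Scale and RAND scale = exp(Intercept), where Intercept and Scale are the rows of the LIFEREG parameter table. The sketch below backs those two values out of the numbers quoted in the question and converts them forward again; the variable names are illustrative. LIFEREG also prints rows labelled "Weibull Scale" and "Weibull Shape" that are already on the RAND scale, so if the quoted values came from those rows (as their names suggest), no further conversion should be needed.

```sas
data _null_;
   /* Values LIFEREG would report as Intercept and Scale, derived from the quoted numbers */
   intercept  = log(6.4465);      /* Intercept = log(Weibull Scale) */
   scale      = 1/1.0147;         /* Scale = 1 / Weibull Shape */
   /* Conversion to the parameters of RAND('Weibull', shape, scale) */
   rand_shape = 1/scale;
   rand_scale = exp(intercept);
   put intercept= 8.4 scale= 8.4 rand_shape= 8.4 rand_scale= 8.4;
run;
```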
