_MooMoo
Obsidian | Level 7

Hello! 

 

I am trying to generate survival times from a Weibull distribution for a simulation study. I took the survival-time equation for the Weibull distribution from Bender: https://epub.ub.uni-muenchen.de/1716/1/paper_338.pdf. That is, 

_MooMoo_2-1599628457722.png

I am trying to confirm that the survival times calculated from the above equation are similar to those generated by the RAND('WEIBULL') function. The sample SAS code I wrote is below. Here, I am assuming the effect B is null, so that exp(B'x)=1:

 _MooMoo_0-1599628242520.png

If I am correct, X and W should give similar results. However, when I looked at the means of X and W, they were quite different. Below is the output:

_MooMoo_1-1599628357818.png

I am unsure why this is happening.

 

Any advice is welcome! Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @_MooMoo,

 

The reason for the discrepancy is that the formula for the Weibull PDF implemented in the RAND function uses a different parameterization than the article. Your definition of W should read

W=rand('WEIBULL',0.9,0.035**(-1/0.9));

to take this into account.
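
To see where the scale value comes from, the two survival functions can be equated. This is only a sketch: it assumes that the article's Weibull survival function is S(t) = exp(-λ t^ν) and that RAND('WEIBULL', a, b) takes a shape a and a scale b with S(t) = exp(-(t/b)^a), which is how I read the SAS documentation.

```latex
\exp\!\left(-\lambda t^{\nu}\right) = \exp\!\left(-(t/b)^{a}\right)
\quad\Longrightarrow\quad
a = \nu, \qquad b = \lambda^{-1/\nu}
```

With λ=0.035 and ν=0.9 this gives b = 0.035**(-1/0.9), which is about 41.5.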

 

(As you noted correctly, the minus sign in the last term of the article's formula must be inside the parentheses. Also the term after the first equals sign is incorrect: The factor 1/λ must be inside the square brackets.)
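
The comparison can be sketched as a short simulation. The values lambda=0.035 and v=0.9 are from the original post; the seed, the sample size and the dataset name are arbitrary choices of mine, and X uses the corrected form of the formula with exp(B'x)=1.

```sas
/* Sketch: compare inverse-transform Weibull times (X) with RAND('WEIBULL') (W). */
/* lambda and v are from the thread; seed and sample size are arbitrary.         */
data sim;
   call streaminit(12345);
   lambda = 0.035;
   v      = 0.9;
   do i = 1 to 100000;
      u = rand('UNIFORM');
      x = (-log(u) / lambda)**(1/v);           /* corrected formula, exp(B'x)=1 */
      w = rand('WEIBULL', v, lambda**(-1/v));  /* shape v, scale lambda**(-1/v) */
      output;
   end;
   keep x w;
run;

/* Both columns should now show similar summary statistics. */
proc means data=sim n mean median std;
   var x w;
run;
```

If the parameterizations match, the mean, median and standard deviation of X and W should agree up to simulation error.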


_MooMoo
Obsidian | Level 7

Hi @FreelanceReinh !

 

Aha, that explains why. Thank you for your clarification!

 

Another question: if I want to generate survival times with a baseline hazard rate of 0.035 and a monotone decrease rate of 0.1, can I set the scale parameter (lambda) to 0.035 and the shape parameter (v) to 1-0.1=0.9 and plug those numbers into Bender's equation?

 

What I am missing is how to set the parameters of the Weibull distribution and which method (Bender's equation or the RAND function) to use to generate survival times.

 

Thanks in advance!

FreelanceReinh
Jade | Level 19

The hazard function of a Weibull random variable with scale parameter lambda and shape parameter v (using the article's parameterization) is h(t)=v*lambda*t**(v-1). As you can see, with 0<v<1 the hazard rate goes to infinity for t → 0. In particular, with lambda=0.035 and v=0.9, it's monotonically decreasing and drops below 0.035 at t=0.3486784...
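
For reference, a sketch of where these two statements come from, assuming the article's survival function S(t) = exp(-λ t^ν):

```latex
S(t) = \exp\!\left(-\lambda t^{\nu}\right), \qquad
h(t) = -\frac{d}{dt}\log S(t) = \nu \lambda t^{\nu-1}

h(t) = \lambda
\;\Longleftrightarrow\; t^{\nu-1} = \frac{1}{\nu}
\;\Longleftrightarrow\; t = \nu^{1/(1-\nu)} = 0.9^{10} \approx 0.3487
```

Note that the crossing point depends only on ν, not on λ.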

 

If you want a different hazard function, maybe one with h(0)=0.035, you need to define it and then go on and derive the survival function from that (by integration and exponentiation). As with the Weibull distribution, chances are that we can simulate suitable survival times using SAS functions and don't need the technique suggested in the article.

_MooMoo
Obsidian | Level 7

Thank you for your response once again @FreelanceReinh.

 

Could you explain further how we can use SAS functions to simulate suitable survival times? 

My plan was to set lambda as the baseline hazard and v as the rate of decrease of the hazard (e.g., the hazard is assumed to decrease by 10% over time), and to plug those two parameters into the equation that the article suggested. 

FreelanceReinh
Jade | Level 19

@_MooMoo wrote:

My plan was to set lambda as the baseline hazard and v as the rate of decrease of the hazard (e.g., the hazard is assumed to decrease by 10% over time), and to plug those two parameters into the equation that the article suggested. 


My first interpretation of this description would result in h(t)=lambda*exp(-v*t), which equals lambda for t=0 (in this sense "baseline"), and with v=-log(0.9) this function decreases by 10% per time unit, i.e., h(t+1)/h(t)=0.9 for all t. However, this function does not describe the hazard rate for any distribution of survival times because S(t)=exp(lambda/v*(exp(-v*t)-1)) does not go to zero if t goes to infinity, hence is not a survival function. So, you need to define a hazard function h(t)=... with a precise formula (or characterize it unambiguously by mathematical criteria) which leads to a valid survival function S.
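
Written out, the integration and exponentiation step for this candidate hazard looks as follows (a sketch using the same symbols; the numeric value uses lambda=0.035 and v=-log(0.9) from above):

```latex
H(t) = \int_0^t \lambda e^{-v u}\,du = \frac{\lambda}{v}\left(1 - e^{-v t}\right),
\qquad
S(t) = \exp\!\left(-H(t)\right) = \exp\!\left(\frac{\lambda}{v}\left(e^{-v t} - 1\right)\right)

\lim_{t\to\infty} S(t) = \exp\!\left(-\frac{\lambda}{v}\right)
= \exp\!\left(-\frac{0.035}{-\log 0.9}\right) \approx 0.717 > 0
```

So under this hazard roughly 72% of subjects would never experience the event, which is why S is not the survival function of a proper survival time.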

MichelleR0
Fluorite | Level 6

May I ask how you got the parameter values 0.9 and 0.035?  I am trying to simulate survival time (time to first event, in days)  but I am unclear what parameters to specify.  Any guidance would help.  Thank you.

FreelanceReinh
Jade | Level 19

Hello @MichelleR0,

 

These values were from @_MooMoo's initial post. I don't know more about them.

 

For a realistic simulation I would probably look at studies or books from the same subject area to find out what kind of parametric models are commonly used there. Let's say, most of them use the Weibull distribution. Then I would think of suitable characteristics, e.g., the expected lifetime or quantiles of the distribution (typically as many as there are parameters to be determined: two in the Weibull case), set them to realistic values and solve the resulting equations for the distribution (here: Weibull) parameters. From the simulated data using these parameters I would compute the same characteristics and expect that, on average, they would tend to be close to the arbitrary values that I started with.
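
As an illustration of this approach, here is a minimal sketch that derives the two Weibull parameters from two target quantiles and then checks them on simulated data. The target values (median 180 days, 90th percentile 700 days), the seed and the dataset name are hypothetical and only serve as an example; it again assumes S(t) = exp(-lambda*t**v) and that RAND('WEIBULL') takes shape and scale.

```sas
/* Sketch: derive Weibull parameters from two target quantiles, then simulate. */
/* The targets (median 180 days, 90th percentile 700 days) are hypothetical.   */
data sim;
   call streaminit(2024);              /* arbitrary seed */
   p1 = 0.5;  t1 = 180;                /* target median */
   p2 = 0.9;  t2 = 700;                /* target 90th percentile */
   /* From S(t) = exp(-lambda*t**v): -log(1-p) = lambda*t**v at each quantile */
   v      = log(log(1-p2) / log(1-p1)) / log(t2/t1);
   lambda = -log(1-p1) / t1**v;
   scale  = lambda**(-1/v);            /* scale in RAND's parameterization */
   put v= lambda= scale=;              /* write the solved parameters to the log */
   do i = 1 to 100000;
      t = rand('WEIBULL', v, scale);
      output;
   end;
   keep t;
run;

/* The sample median and P90 should be close to the targets. */
proc means data=sim n median p90;
   var t;
run;
```

Other pairs of characteristics (e.g., mean and median) work the same way, but the resulting equations may need a numerical solver.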

MichelleR0
Fluorite | Level 6
Thank you for your reply. I am simulating data informed by data from a clinical trial. I cannot use the real data, but I can base the simulated data on the real parameters of the dataset. I've read the SAS documentation on data simulation using the RAND function and have done so successfully for continuous variables with a normal distribution and for binary and categorical data. However, I cannot figure out how to determine which distribution the time-to-event data follow, and when I searched, I read that survival data fall under a log-Weibull distribution.
FreelanceReinh
Jade | Level 19

@MichelleR0 wrote:
(...) I read that survival data fall under a log-Weibull distribution.

So, according to https://en.wikipedia.org/wiki/Gumbel_distribution, the survival times can be simulated using the RAND('GUMBEL', μ, σ) function and explicit formulas exist for several characteristics including mean, median and standard deviation. Hence, the two unknown parameters μ and σ can be determined as outlined in my previous post. For example, starting with suitable values for mean and median, the parameters would be simply obtained as the solution of two linear equations.
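
The two linear equations can be sketched as follows, using the mean and median formulas from the Wikipedia article (γ ≈ 0.5772 is the Euler-Mascheroni constant). This assumes that RAND('GUMBEL') uses the same location-scale parameterization as that article, which is worth confirming in the SAS documentation.

```latex
\text{mean} = \mu + \gamma\,\sigma, \qquad
\text{median} = \mu - \sigma \ln(\ln 2)

\sigma = \frac{\text{mean} - \text{median}}{\gamma + \ln(\ln 2)}
\approx \frac{\text{mean} - \text{median}}{0.2107},
\qquad
\mu = \text{mean} - \gamma\,\sigma
```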
