Arick
Fluorite | Level 6

I need some help with a problem related to a previous post. Here's the situation:

I have a huge dataset (N=10000) that contains simulated event times, some of them right-censored, and I would like to generate a similar simulated dataset using the Weibull parameters estimated from it. First, I ran PROC LIFEREG with the following MODEL statement:

     MODEL day*cnsr(1)= / dist=Weibull;

This generated the following ML estimates:

Intercept=7.7, Scale=4.5, Weibull_Scale=2300, Weibull_Shape=0.22.
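
For reference, the four estimates are not independent. PROC LIFEREG models the log of the event time as a location-scale model and reports the Weibull parameters as transformations of the Intercept and Scale. As a sketch, writing $\mu$ for the Intercept and $\sigma$ for the Scale (symbols introduced here for illustration, not taken from the output):

```latex
\begin{align*}
\text{Weibull\_Scale} &= \exp(\mu) = \exp(7.7) \approx 2208, \\
\text{Weibull\_Shape} &= 1/\sigma = 1/4.5 \approx 0.22 .
\end{align*}
```

The quoted values are rounded, so a reported Weibull_Scale of 2300 would correspond to an unrounded Intercept near 7.74.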

 

Then, to simulate time values in a new huge dataset, I used the SAS statements:

     ...

     call streaminit(72131);

     Day=rand("Weibull", 0.22, 2300);

     ...
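
For context, a minimal self-contained version of such a DATA step might look like the sketch below. The dataset name `sim`, the loop counter `i`, the flag `Within1Yr`, and the record count of 10000 are assumptions for illustration; only the seed and the RAND call come from the post. In RAND("Weibull", a, b), the first parameter is the shape and the second is the scale.

```sas
data sim;
   call streaminit(72131);                 /* seed from the post */
   do i = 1 to 10000;                      /* record count is an assumption */
      Day = rand("Weibull", 0.22, 2300);   /* shape = 0.22, scale = 2300 */
      Within1Yr = (Day <= 365);            /* 1 if the event falls within a year */
      output;
   end;
   drop i;
run;

/* The mean of the 0/1 flag is the proportion with Day <= 365 */
proc means data=sim mean;
   var Within1Yr;
run;
```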

But the distribution of times in this new dataset is very different. For example, the proportion with day<=365 in the new dataset is 48%, compared with 58% in the original dataset. This discrepancy persists even if I change the randomization seed or increase the number of simulated records.
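
As a quick check on the 48% figure, the Weibull CDF with shape $a$ and scale $b$ can be evaluated directly at the quoted (rounded) estimates. This is a worked calculation added for illustration, not something stated in the thread:

```latex
\begin{align*}
P(\text{Day} \le t) &= 1 - \exp\!\left[-\left(t/b\right)^{a}\right], \\
P(\text{Day} \le 365) &= 1 - \exp\!\left[-\left(365/2300\right)^{0.22}\right]
  \approx 1 - e^{-0.667} \approx 0.49 .
\end{align*}
```

That is close to the simulated 48%, which suggests the RAND call is reproducing the fitted Weibull faithfully and that the gap to 58% lies between the fitted model and the original data rather than in the simulation step.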

     I found the documentation for PROC LIFEREG and the RAND function too confusing to help me reconcile this problem. 

If anyone can offer me any help, I would greatly appreciate it.

 

Accepted Solution
Arick
Fluorite | Level 6

I have been informed that I must remove the censor term from my PROC LIFEREG MODEL statement, i.e., use

model Time = / dist=Weibull;

   instead of

model Time*Cnsr(1) = / dist=Weibull;

Although this does seem to work (i.e., it gives the same estimates as PROC UNIVARIATE), it does surprise me that the procedure doesn't automatically account for the censoring. At any rate, my mystery is solved.


3 REPLIES
ballardw
Super User

How did you incorporate the Intercept term?

 

At a minimum, you should show the entire PROC LIFEREG code as well as how you applied that result to your data.

Arick
Fluorite | Level 6

Dear ballardw,

     I didn't do anything with the intercept. Apologies for the ambiguity of my previous post. The attached SAS code (with comments) should clarify what I am struggling to understand. In a nutshell, the PROC LIFEREG estimates for Weibull shape and scale do not seem consistent with those from PROC UNIVARIATE, whereas the PROC UNIVARIATE estimates can be used to generate simulated data with the rand('Weibull', a, b) function. In other words, the Weibull parameter estimates from PROC LIFEREG don't seem to be of any use. But I suspect that I am misunderstanding or misusing the LIFEREG estimates.

     (Please note that the attached code is different from what I used before, so please ignore the estimates that I quoted in my previous post. Indeed, I think you can ignore that post entirely.)

     Summary of attached SAS code: I generate a dataset, Q0, with a given Weibull shape and scale. I verify that the time distribution is what I wanted. I calculate the Weibull shape and scale from Q0 in two ways: PROC UNIVARIATE and PROC LIFEREG. I then use the PROC LIFEREG estimates to generate a new dataset to see whether its time distribution is consistent with Q0.
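
The attached code is not reproduced in the thread. A rough sketch of the workflow described, under assumed values, might look like the following. The shape (1.5), scale (100), seed, and sample size are hypothetical, and the variable names `Time` and `Cnsr` are borrowed from the MODEL statements quoted in the thread. Note that in `Time*Cnsr(1)` the value in parentheses marks the censored observations, so fully observed times need a different value (0 in this sketch).

```sas
/* Step 1: simulate Q0 from a Weibull with assumed shape and scale */
data Q0;
   call streaminit(12345);               /* hypothetical seed */
   do i = 1 to 10000;                    /* hypothetical sample size */
      Time = rand("Weibull", 1.5, 100);  /* hypothetical shape, scale */
      Cnsr = 0;                          /* 0 = event observed, not censored */
      output;
   end;
   drop i;
run;

/* Step 2a: Weibull fit via PROC UNIVARIATE (C = shape, Sigma = scale) */
proc univariate data=Q0;
   var Time;
   histogram Time / weibull;
run;

/* Step 2b: Weibull fit via PROC LIFEREG; Cnsr = 1 would flag censored times */
proc lifereg data=Q0;
   model Time*Cnsr(1) = / dist=weibull;
run;
```

With every record coded Cnsr = 0, the two procedures should report closely matching shape and scale estimates. If the indicator were coded the other way around, LIFEREG would treat the observed events as censored and the estimates would differ.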
