02-26-2013 10:25 AM

Any heads-up on this? I need to simulate censored data such that most of the observations with the longest times are censored.

Accepted Solutions

Solution

03-06-2013 03:30 PM

All Replies

02-26-2013 11:00 AM

Let's see how close this comes to what you need. It creates a variable named CENSORED, with values of 0 or 1, randomly assigned, with higher probabilities of "1" for longer TIME values.

proc sort data=have;
   by time;
run;

data want;
   set have nobs=_total_obs_;
   censored = ranuni(12345) < _n_/_total_obs_;
run;
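
The snippet above assumes an input data set named HAVE that already contains a TIME variable. As a minimal sketch of how it could be tried end to end, the example below builds a made-up HAVE first. The exponential distribution, the mean of 10, the ID variable, and the STREAMINIT seed are all illustrative assumptions, not part of the original answer.

```sas
/* Hypothetical input: 100 subjects with an exponential TIME (mean 10). */
data have;
   call streaminit(2013);              /* arbitrary seed, for reproducibility */
   do id = 1 to 100;
      time = 10 * rand("EXPONENTIAL"); /* standard exponential scaled to mean 10 */
      output;
   end;
run;

/* Sort so that the longest times sit at the end of the data set. */
proc sort data=have;
   by time;
run;

/* _N_ is the row number, so the chance of CENSORED = 1 grows with the rank of TIME. */
data want;
   set have nobs=_total_obs_;
   censored = ranuni(12345) < _n_ / _total_obs_;
run;
```

Because the probability depends only on the row position after sorting, any data set with a TIME variable can be substituted for the toy HAVE.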

02-26-2013 11:43 AM

I do not know what your program is doing, since it does not tell me anything about where data=have comes from.

This is what I have, but I can't fix it so that longer t are more likely to be censored:

data want;
   lambdaT = 0.0005;                  * baseline hazard;
   lambdaC = 0.004;                   * censoring hazard;
   do i = 1 to 100;
      t = rand("WEIBULL", 1, lambdaT);   * time of event;
      c = rand("WEIBULL", 1, lambdaC);   * time of censoring;
      time = min(t, c);                  * which came first?;
      censored = (c lt t);
      output;
   end;
run;

proc sort data=want;
   by time;
run;

02-26-2013 12:11 PM

OK, start with your program, but remove the line about censored = (c lt t).

After sorting, continue with my logic:

data want;
   set want nobs=_total_obs_;
   censored = ranuni(12345) < _n_ / _total_obs_;
run;

There are many ways to adjust the final assignment. Toward the beginning of the data, the censor rate is nearly 0, and for the final observation it is 100%. The idea is to rely on the sorted order so that the further into the data you go, the higher the time value, and the greater the likelihood of censorship. Just as one example, this formula would have the censorship rate rise to 50% instead of 100% for the highest time values:

censored = ranuni(12345) < 0.5 * _n_ / _total_obs_;

Also note, if you are performing many simulations you may want to vary the seed to the random number function. A nonzero seed always generates the same stream of random numbers.
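
One way to vary the seed across simulations is to pass it in as a macro parameter. The sketch below is only an illustration: the macro name ASSIGN_CENSOR, its parameters, the toy SORTED data set, and the seed values are all hypothetical.

```sas
/* Hypothetical helper macro: one censoring assignment per seed. */
%macro assign_censor(indata=, outdata=, seed=, maxrate=0.5);
   data &outdata;
      set &indata nobs=_total_obs_;
      /* probability rises linearly from near 0 up to MAXRATE */
      censored = ranuni(&seed) < &maxrate * _n_ / _total_obs_;
   run;
%mend assign_censor;

/* Toy input already in ascending TIME order (illustrative only). */
data sorted;
   do time = 1 to 100;
      output;
   end;
run;

/* Different seeds give different random streams, so SIM1 and SIM2 get
   independent censoring assignments. */
%assign_censor(indata=sorted, outdata=sim1, seed=101);
%assign_censor(indata=sorted, outdata=sim2, seed=202);
```

Rerunning with the same positive seed reproduces the same assignment, which is useful when a particular simulation needs to be revisited.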

02-26-2013 01:48 PM

Thanks. How do I reduce the censoring rate? A 50% censoring rate is very high.

02-26-2013 01:56 PM

The first number in the formula is the censored rate for the highest time value. To reduce to a 25% rate, use:

censored = ranuni(12345) < 0.25 * _n_ / _total_obs_;

The censored values are randomly assigned. But the overall censored rate for the entire data set will be roughly half the number in the formula. The rate starts at 0% and ends up at 25%, steadily increasing as you make your way through the data set.
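
The "roughly half" figure can be checked with a short calculation. Writing $c$ for the first number in the formula and $N$ for the number of observations (both symbols introduced here for illustration), the expected overall censored rate is the average of the per-row probabilities:

```latex
\[
  \frac{1}{N}\sum_{n=1}^{N} c\,\frac{n}{N}
  \;=\; \frac{c}{N^{2}}\cdot\frac{N(N+1)}{2}
  \;=\; c\,\frac{N+1}{2N}
  \;\approx\; \frac{c}{2}.
\]
```

With $c = 0.25$ and $N = 100$ this gives $0.25 \times 101/200 = 0.12625$, so about 12.6% of observations are expected to be censored. The rate in any single simulated data set will vary around that value.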

02-26-2013 09:57 PM

Thanks. However, this is not what I want. This just sorts the data and puts the censored values at the end, without taking into consideration their event time and censoring time.

02-27-2013 07:53 AM

02-27-2013 09:13 AM

Did you actually sort the data first? Did you check the results? You'll have to put these two items together:

1. The data set is sorted, so all the highest time values are at the end of the data set.

2. The censor rate increases as you move from the beginning to the end of the data set.

Like Steve said, if you want to change the question, just explain.
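
To check the results, the two items can be put together and the censored proportion compared across quartiles of TIME. This is a sketch under stated assumptions: the data set names SIM and RANKED, the variable TIME_QUARTILE, and the 0.25 maximum rate are chosen here for illustration. The simulation step keeps the posted parameter values unchanged. Note that RAND("WEIBULL", a, b) takes a shape a and a scale b, so with shape 1 the lambda values act as exponential scale parameters rather than hazard rates.

```sas
/* The simulation as posted, minus the line censored = (c lt t). */
data sim;
   lambdaT = 0.0005;
   lambdaC = 0.004;
   do i = 1 to 100;
      t = rand("WEIBULL", 1, lambdaT);   /* time of event */
      c = rand("WEIBULL", 1, lambdaC);   /* time of censoring */
      time = min(t, c);
      output;
   end;
run;

/* Item 1: sort so the highest TIME values are at the end. */
proc sort data=sim;
   by time;
run;

/* Item 2: the censoring probability rises with position in the sorted data. */
data want;
   set sim nobs=_total_obs_;
   censored = ranuni(12345) < 0.25 * _n_ / _total_obs_;
run;

/* Split into quartiles of TIME (0 = shortest, 3 = longest). */
proc rank data=want out=ranked groups=4;
   var time;
   ranks time_quartile;
run;

/* The mean of a 0/1 variable is the proportion censored in each quartile. */
proc means data=ranked n mean;
   class time_quartile;
   var censored;
run;
```

If the logic behaves as described, the censored proportion should tend to rise from quartile 0 to quartile 3, although with only 25 observations per quartile random noise can blur the trend in a single run.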
