Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- assigning a random index-date

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 01-12-2018 11:12 AM
(2208 views)

I have the following dataset

```
group ptid dx_dt index_dt
0 1 1/1/2005 .
0 2 2/2/2005 .
0 3 1/2/2007 .
0 4 3/3/2006 .
1 5 1/24/2005 2/24/2005
1 6 1/6/2005 1/15/2005
1 7 4/2/2007 4/29/2007
1 8 3/3/2006 3/9/2006
```

I have the following data with patient in the group "0" without index_date but only diagnosis date and those in group "1" with index_Date and diagnosis date. I want to generate a random index date for group "0" based on the information obtained from group "1"

any help is much appreciated

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I explicitly noted that issue in the comments and am leaving that up to you 🙂

You could either:

1. If less than 0, set to 0 and assume same day

2. Use a different distribution that won't produce negative numbers

3. Repeatedly generate random numbers until you get a non-negative value

4. Some other options that I likely haven't thought of that makes sense given your data....

8 REPLIES 8

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Does this need to scale beyond your data here? Are there multiple groups and such that need to be accounted for?

In this scenario it's relatively simple but if it needs to be generalized it would help to know that ahead of time.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

taking into account that all of the randomly generated date (index-date) for the control group should occur after the diagnosis-date for that particular id.

This wasn't included in your original criteria.

Are you looking to replicate the duration, ie time from diagnosis to index date, or replicate the dates.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I would assume a normal or some type of distribution then on the duration.

So I would calculate the duration for the data I have, and the mean and standard deviation.

Then simulate a value using a Random Normal distribution and add it to the date you have.

Here's an example using Normal distribution. You'll need to check that its after the dx_dt or account for that condition.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you Reeza but the code produce duration with negative value meaning some of the index_date happen before the dx_dt, can you help correct the error

```
data have;
input group patid dx_dt: mmddyy10. index_dt: mmddyy10. ;
format dx_dt index_dt mmddyy10.;
cards;
0 1 1/1/2005 .
0 2 2/2/2005 .
0 3 1/2/2007 .
0 4 3/3/2006 .
1 5 1/24/2005 2/24/2005
1 6 1/6/2005 1/15/2005
1 7 4/2/2007 4/29/2007
1 8 3/3/2006 3/9/2006
;;;;
run;
data have;
set have;
duration= index_dt-dx_dt;run;
proc sql noprint;
select mean(duration), std(duration)
into: dur_avg, :dur_std
from have
where not missing(duration);
quit;
%put &dur_avg;
%out &dur_std;
data want;
set have;
if missing(index_dt) then
duration= rand('normal', &dur_avg, &dur_std);
index_dt= floor(dx_dt +duration);
format index_dt dx_dt date9.;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I explicitly noted that issue in the comments and am leaving that up to you 🙂

You could either:

1. If less than 0, set to 0 and assume same day

2. Use a different distribution that won't produce negative numbers

3. Repeatedly generate random numbers until you get a non-negative value

4. Some other options that I likely haven't thought of that makes sense given your data....

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you for the prompt reply!

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.