<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Pick random values conditional to other variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568153#M159883</link>
    <description>&lt;P&gt;Thank you Reeza! It's such a thorough approach!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Year of diagnosis is known for all records (dx_year). The goal is to impute missing month and/or missing day within the known year of diagnosis. Therefore, the first possible date is the beginning of the year of diagnosis instead of the year of censoring.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorry for not including the day-only missing scenario in the mock data. In the inclusive data, ID=3 was censored (03/26/2005) prior to diagnosis (04/99/2005).&lt;/P&gt;</description>
    <pubDate>Sat, 22 Jun 2019 19:03:13 GMT</pubDate>
    <dc:creator>Cruise</dc:creator>
    <dc:date>2019-06-22T19:03:13Z</dc:date>
    <item>
      <title>Pick random values conditional to other variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568099#M159857</link>
      <description>&lt;P&gt;Hi Folks:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm trying to create random dates to replace incomplete dates that have a missing month and/or day. The year fields are complete. I used the %RandBetween macro from Rick's tutorial&amp;nbsp;&lt;A href="https://blogs.sas.com/content/iml/2015/10/05/random-integers-sas.html" target="_blank" rel="noopener"&gt;https://blogs.sas.com/content/iml/2015/10/05/random-integers-sas.html&lt;/A&gt;. The randomly selected date must not fall after the censoring date and must be within the known year of diagnosis.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you please look at my SAS code below, which creates RAND_DX on the mock data? Any critiques, suggestions, or improvements that help me meet the objective stated above are appreciated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for your time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My variables:&lt;/P&gt;
&lt;P&gt;Date (dx_month, dx_day and dx_year)&lt;/P&gt;
&lt;P&gt;Censor (censor_month, censor_day, censor_year)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Assumptions/facts:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;All records have ‘year’ of diagnosis (no missing in the year of diagnosis)&lt;/LI&gt;
&lt;LI&gt;‘day’ field is unknown when ‘month’ field is unknown&lt;/LI&gt;
&lt;LI&gt;‘day’ field can be unknown when ‘month’ field is known &amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Effect of leap year and February is negligible&lt;/LI&gt;
&lt;LI&gt;Variable ‘censor’ is complete&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
INPUT ID dx_year dx_month dx_day CENSOR CENSOR_DAY CENSOR_MONTH CENSOR_YEAR;
CARDS;
1	2005	99	99	20088	31	12	2014
2	2005	99	99	17308	22	5	2007
3	2005	99	99	16521	26	3	2005
4	2005	99	99	17074	30	9	2006
5	2005	99	99	16620	3	7	2005
6	2005	99	99	16709	30	9	2005
7	2005	99	99	16901	10	4	2006
8	2005	99	99	16777	7	12	2005
9	2005	99	99	16763	23	11	2005
10	2005	99	99	17750	6	8	2008
11	2005	99	99	16621	4	7	2005
12	2005	99	99	20088	31	12	2014
13	2005	99	99	16636	19	7	2005
14	2005	99	99	16685	6	9	2005
15	2005	99	99	16778	8	12	2005
16	2005	99	99	17129	24	11	2006
17	2005	99	99	20088	31	12	2014
18	2005	99	99	17057	13	9	2006
19	2005	99	99	16486	19	2	2005
20	2005	99	99	16548	22	4	2005
;

%macro RandBetween(min, max);
   (&amp;amp;min + floor((1+&amp;amp;max-&amp;amp;min)*rand("uniform")))
%mend;

DATA HAVE1; SET HAVE; 
RAND_DX_MONTH = %RandBetween(1,CENSOR_MONTH); 
RAND_DX_DAY= %RandBetween(1,CENSOR_DAY);
RAND_DX=MDY(RAND_DX_MONTH, RAND_DX_DAY, DX_YEAR); 
run; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp; &lt;/P&gt;</description>
      <pubDate>Sat, 22 Jun 2019 19:04:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568099#M159857</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-22T19:04:26Z</dc:date>
    </item>
    <item>
      <title>Re: Pick random values conditional to other variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568103#M159860</link>
      <description>&lt;P&gt;None of the records in your sample data set meet this situation/assumption:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/132289"&gt;@Cruise&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Hi Folks:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm trying to create random dates to replace incomplete dates with missing month and/or day. Year fields are complete. I used %Rand macro from Rick's tutorial&amp;nbsp;&lt;A href="https://blogs.sas.com/content/iml/2015/10/05/random-integers-sas.html" target="_blank" rel="noopener"&gt;https://blogs.sas.com/content/iml/2015/10/05/random-integers-sas.html&lt;/A&gt;. Conditions for the random selection include dates be not after censoring date and must be within the known year.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you please look at my SAS code below on the mock data where I created RAND_DX. Any critiques or suggestions or improvements appreciated to achieve my programming objective I stated above.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for taking your time in this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My variables:&lt;/P&gt;
&lt;P&gt;Date (dx_month, dx_day and dx_year)&lt;/P&gt;
&lt;P&gt;Censor (censor_month, censor_day, censor_year)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Assumptions/facts:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;All records have ‘year’ of date (no missing)&lt;/LI&gt;
&lt;LI&gt;‘day’ field is unknown when ‘month’ field is unknown&lt;/LI&gt;
&lt;LI&gt;&lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;‘day’ field can be unknown when ‘month’ field is known &amp;nbsp;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/LI&gt;
&lt;LI&gt;Effect of leap year and February is negligible&lt;/LI&gt;
&lt;LI&gt;Variable ‘censor’ is complete&lt;/LI&gt;
&lt;/OL&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jun 2019 02:20:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568103#M159860</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-22T02:20:40Z</dc:date>
    </item>
    <item>
      <title>Re: Pick random values conditional to other variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568104#M159861</link>
      <description>&lt;P&gt;SAS has the rand('integer', ....) function now so you don't need that macro unless you're on an older version of SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dates are integers. So you can calculate the earliest possible date for the 'event' and latest possible date and select a random value in-between them.&amp;nbsp;&lt;/P&gt;
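&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a minimal sketch of that idea (the 2005 bounds are only an illustration, and rand('integer') assumes a recent release, SAS 9.4M5 or later as far as I know):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
    /* a SAS date is stored as the number of days since 01JAN1960 */
    first_possible_date = mdy(1, 1, 2005);
    last_possible_date  = mdy(12, 31, 2005);
    /* so any integer between the two bounds is itself a valid date */
    event_date = rand('integer', first_possible_date, last_possible_date);
    put first_possible_date= last_possible_date= event_date=;
    put event_date= date9.;
run;&lt;/CODE&gt;&lt;/PRE&gt;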
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your current methodology is more restrictive than this approach. For example, you can't have the event be within the same month as the censoring, which is possible. You may also need a longer gap depending on what exactly you're measuring; for example, diagnosis to censoring may always be at least 60 days.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the day portion of the date, what happens if the day of censoring is the first of the month? With your current method, that leaves only a single day that can be chosen. The day does not need to be less than or equal to the censoring day; the valid range depends on the month as well. Otherwise you add a restriction that may influence your data without being accurate at all.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I haven't bothered to set any seeds; I'll leave that to you. I added a check variable below so you can verify that the event date is always before the censor date.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA HAVE;
    INPUT ID dx_year dx_month dx_day CENSOR CENSOR_DAY CENSOR_MONTH CENSOR_YEAR;

    CARDS;
1  2005    12  99  20088   31  12  2014
2  2005    1  99  17308   22  5   2007
3  2005    4  99  16521   26  3   2005
4  2005    99  99  17074   30  9   2006
5  2005    99  99  16620   3   7   2005
6  2005    99  99  16709   30  9   2005
7  2005    99  99  16901   10  4   2006
8  2005    99  99  16777   7   12  2005
9  2005    99  99  16763   23  11  2005
10 2005    99  99  17750   6   8   2008
11 2005    99  99  16621   4   7   2005
12 2005    99  99  20088   31  12  2014
13 2005    99  99  16636   19  7   2005
14 2005    99  99  16685   6   9   2005
15 2005    99  99  16778   8   12  2005
16 2005    99  99  17129   24  11  2006
17 2005    99  99  20088   31  12  2014
18 2005    99  99  17057   13  9   2006
19 2005    99  99  16486   19  2   2005
20 2005    99  99  16548   22  4   2005
;

data want;
    set have;
    
    *calculate censor date;
    censor_date=mdy(censor_month, censor_day, censor_year);

    *handle missing values for both day and month;
    if dx_month = 99 and dx_day = 99 then
        do;
            *first possible date is the beginning of the year;
            first_possible_date=intnx('year', censor_date, 0, 'b');
            *last possible date is the day before censoring;
            last_possible_date=censor_date - 1;
            *event date is a random date between these dates;
            event_date=rand('integer', first_possible_date, last_possible_date);
            *assign day and months as needed;
            dx_month=month(event_date);
            dx_day=day(event_date);
        end;
    else if dx_day eq 99 and dx_month ne 99 then
        do;
            *first possible date is beginning of month indicated;
            first_possible_date=mdy(dx_month, 1, dx_year);
            *last possible date is last of month;
            *EXCEPT if month is same as censor month;
            *in that case, use the earlier of the two dates;
            last_possible_date=min(intnx('month', first_possible_date, 0, 'e'), 
                                   censor_date - 1);
            *calculate a random event date;
            event_date=rand('integer', first_possible_date, last_possible_date);
            *assign day as needed;
            dx_day=day(event_date);
        end;
    else
        event_date=mdy(dx_month, dx_day, dx_year);
    *format dates for display and legibility;
    format censor event_date first_possible_date last_possible_date date9.;
    
    check = censor_date - event_date;
run;&lt;/CODE&gt;&lt;/PRE&gt;
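&lt;P&gt;If you want the imputed dates to be reproducible, a minimal sketch of seeding looks like this (the seed value and the data set name DEMO are arbitrary placeholders, not anything from your data):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data demo;
    /* seed the stream once, before the first call to rand() */
    call streaminit(12345);
    do i = 1 to 5;
        /* same seed, same sequence of random dates on every run */
        event_date = rand('integer', '01JAN2005'd, '31DEC2005'd);
        output;
    end;
    format event_date date9.;
run;&lt;/CODE&gt;&lt;/PRE&gt;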
&lt;P&gt;PS.....if this is going to be published I'm going to want my name on the paper &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jun 2019 02:47:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568104#M159861</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-06-22T02:47:30Z</dc:date>
    </item>
    <item>
      <title>Re: Pick random values conditional to other variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568153#M159883</link>
      <description>&lt;P&gt;Thank you Reeza! It's such a thorough approach!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Year of diagnosis is known for all records (dx_year). The goal is to impute missing month and/or missing day within the known year of diagnosis. Therefore, the first possible date is the beginning of the year of diagnosis instead of the year of censoring.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorry for not including the day-only missing scenario in the mock data. In the inclusive data, ID=3 was censored (03/26/2005) prior to diagnosis (04/99/2005).&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jun 2019 19:03:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568153#M159883</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-22T19:03:13Z</dc:date>
    </item>
    <item>
      <title>Re: Pick random values conditional to other variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568156#M159886</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Below is the closest I got using your approach. Please ignore the negative survival for ID=3, and comment if you find another problem.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have1; set have;
if dx_month = 99 and dx_day = 99 then fake_dx=mdy(1,1,dx_year);
else 
if dx_day eq 99 and dx_month ne 99 then
fake_dx=mdy(dx_month,1,dx_year);
run; 

data want;
    set have1; 
    *calculate censor date;
    censor_date=mdy(censor_month, censor_day, censor_year);

    *handle missing values for both day and month;
    if dx_month = 99 and dx_day = 99 then
        do;
            *first possible date is the beginning of the known year of diagnosis;
            first_possible_date=intnx('year', fake_dx, 0, 'b');
            *last possible date is the earlier of the end of the known year of diagnosis or the day before censoring;
            last_possible_date=min(intnx('year', first_possible_date, 0, 'e'), 
                                   censor_date - 1);
            *event date is a random date between these dates;
            event_date=rand('integer', first_possible_date, last_possible_date);
            *assign day and months as needed;
            dx_month=month(event_date);
            dx_day=day(event_date);
        end;
    else if dx_day eq 99 and dx_month ne 99 then
        do;
            *first possible date is beginning of the known month of diagnosis;
            first_possible_date=mdy(dx_month, 1, dx_year);
            *last possible date is the last day of the month;
            *EXCEPT if month is same as censor month;
            *in that case, use the earlier of the two dates;
            last_possible_date=min(intnx('month', first_possible_date, 0, 'e'), 
                                   censor_date - 1);
            *calculate a random event date;
            event_date=rand('integer', first_possible_date, last_possible_date);
            *assign day as needed;
            dx_day=day(event_date);
        end;
    else
        event_date=mdy(dx_month, dx_day, dx_year);
    *format dates for display and legibility;
    format censor event_date first_possible_date last_possible_date fake_dx date9.;
    check = censor_date - event_date;
run;

proc print data=want(drop=censor_:);
run;
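
/* Sketch only, an assumption rather than part of the approach above: */
/* list records whose imputation window is empty, i.e. the day before */
/* censoring falls before the first possible diagnosis date (ID=3).   */
proc print data=want;
    where first_possible_date gt last_possible_date;
    var ID first_possible_date last_possible_date censor_date;
    format censor_date date9.;
run;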
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 22 Jun 2019 19:45:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Pick-random-values-conditional-to-other-variables/m-p/568156#M159886</guid>
      <dc:creator>Cruise</dc:creator>
      <dc:date>2019-06-22T19:45:42Z</dc:date>
    </item>
  </channel>
</rss>

