<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using array to identify a date and make a new variable from identified date in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754180#M237777</link>
    <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;I am working with a dataset in which I care mostly about 3 variables: pos_test1--pos_test50 (positive test result), complete_date and collect_date1--collect_date50. All variables are unique to the individual, and &amp;amp;do_n is a macro currently set to 50 (collect_date1-collect_date50, pos_test1-pos_test50). I want to identify the closest date of the collect_date variable range with a positive test result to to the complete_date. Currently, it is accurately selecting that ANY positive test linked with a collect_date variable exists, but not matching the date correctly. Here is the code that I am using:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
data have;

array pos_test [&amp;amp;do_n] pos_test1-pos_test&amp;amp;do_n;
array collect_date [&amp;amp;do_n] collect_date1-collect_date&amp;amp;do_n;

do i = 1 to &amp;amp;do_n;&lt;BR /&gt;
if pos_test[i] =1 then do;

closest=collect_date[1];
diff = (collect_date[1]-complete_date);


if collect_date[i] &amp;gt; complete_date then do;

if . &amp;lt; (collect_date[i]-complete_date) &amp;lt; diff then closest=collect_date[i];

end;
end;
end;

format closest mmddyy10.;

run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It returns the correct individual cases,&amp;nbsp;&lt;EM&gt;however,&lt;/EM&gt; it does not return the correct date. It always returns collect_date1. My intention is to select the matching pair of pos_test and collect_date. For example a test on 5/12/21 had a positive result, but 5/20/21 had a negative. This code is identifying that the 5/12/21 positive test exists, and including that case in the selected output, but is still outputting the "closest" variable as 5/20/21.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope this makes sense. Please feel free to ask clarifying questions and thank you in advance for the help.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 14 Jul 2021 20:48:12 GMT</pubDate>
    <dc:creator>katemarieb</dc:creator>
    <dc:date>2021-07-14T20:48:12Z</dc:date>
    <item>
      <title>Using array to identify a date and make a new variable from identified date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754180#M237777</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;&lt;P&gt;I am working with a dataset in which I care mostly about 3 variables: pos_test1--pos_test50 (positive test result), complete_date and collect_date1--collect_date50. All variables are unique to the individual, and &amp;amp;do_n is a macro currently set to 50 (collect_date1-collect_date50, pos_test1-pos_test50). I want to identify the closest date of the collect_date variable range with a positive test result to to the complete_date. Currently, it is accurately selecting that ANY positive test linked with a collect_date variable exists, but not matching the date correctly. Here is the code that I am using:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
data have;

array pos_test [&amp;amp;do_n] pos_test1-pos_test&amp;amp;do_n;
array collect_date [&amp;amp;do_n] collect_date1-collect_date&amp;amp;do_n;

do i = 1 to &amp;amp;do_n;&lt;BR /&gt;
if pos_test[i] =1 then do;

closest=collect_date[1];
diff = (collect_date[1]-complete_date);


if collect_date[i] &amp;gt; complete_date then do;

if . &amp;lt; (collect_date[i]-complete_date) &amp;lt; diff then closest=collect_date[i];

end;
end;
end;

format closest mmddyy10.;

run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It returns the correct individual cases,&amp;nbsp;&lt;EM&gt;however,&lt;/EM&gt; it does not return the correct date. It always returns collect_date1. My intention is to select the matching pair of pos_test and collect_date. For example a test on 5/12/21 had a positive result, but 5/20/21 had a negative. This code is identifying that the 5/12/21 positive test exists, and including that case in the selected output, but is still outputting the "closest" variable as 5/20/21.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I hope this makes sense. Please feel free to ask clarifying questions and thank you in advance for the help.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jul 2021 20:48:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754180#M237777</guid>
      <dc:creator>katemarieb</dc:creator>
      <dc:date>2021-07-14T20:48:12Z</dc:date>
    </item>
    <item>
      <title>Re: Using array to identify a date and make a new variable from identified date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754186#M237779</link>
      <description>&lt;P&gt;It's hard to really investigate such questions without sample data and fully functional code to actually execute and replicate what you describe.&lt;/P&gt;
&lt;P&gt;From the looks of it and your description may-be the issue is caused by you using a value of &lt;EM&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/EM&gt; instead of variable &lt;EM&gt;&lt;STRONG&gt;i&lt;/STRONG&gt;&lt;/EM&gt; as array index in below code section.&lt;/P&gt;
&lt;PRE&gt;closest=collect_date[1];
diff = (collect_date[1]-complete_date);&lt;/PRE&gt;</description>
      <pubDate>Wed, 14 Jul 2021 20:56:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754186#M237779</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2021-07-14T20:56:20Z</dc:date>
    </item>
    <item>
      <title>Re: Using array to identify a date and make a new variable from identified date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754204#M237785</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;It's hard to really investigate such questions without sample data and fully functional code to actually execute and replicate what you describe.&lt;/P&gt;
&lt;P&gt;From the looks of it and your description may-be the issue is caused by you using a value of &lt;EM&gt;&lt;STRONG&gt;1&lt;/STRONG&gt;&lt;/EM&gt; instead of variable &lt;EM&gt;&lt;STRONG&gt;i&lt;/STRONG&gt;&lt;/EM&gt; as array index in below code section.&lt;/P&gt;
&lt;PRE&gt;closest=collect_date[1];
diff = (collect_date[1]-complete_date);&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I've never made that mistake in 30+ years of programming. Honest! &lt;span class="lia-unicode-emoji" title=":smiling_face_with_horns:"&gt;😈&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jul 2021 22:15:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754204#M237785</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-07-14T22:15:30Z</dc:date>
    </item>
    <item>
      <title>Re: Using array to identify a date and make a new variable from identified date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754315#M237841</link>
      <description>&lt;P&gt;I completely understand the difficulty, apologies! It is a protected dataset, but it looks something like this once in SAS and cleaned:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;Pos_test1&lt;/TD&gt;&lt;TD&gt;Pos_test2&lt;/TD&gt;&lt;TD&gt;Pos_test3&lt;/TD&gt;&lt;TD&gt;Complete_date&lt;/TD&gt;&lt;TD&gt;Closest&lt;/TD&gt;&lt;TD&gt;Collect_date1&lt;/TD&gt;&lt;TD&gt;Collect_date2&lt;/TD&gt;&lt;TD&gt;Collect_date3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;ID1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;TD&gt;5/21/2021&lt;/TD&gt;&lt;TD&gt;6/20/2021&lt;/TD&gt;&lt;TD&gt;6/20/2021&lt;/TD&gt;&lt;TD&gt;6/10/2021&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;ID2&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;TD&gt;3/2/2021&lt;/TD&gt;&lt;TD&gt;5/10/2021&lt;/TD&gt;&lt;TD&gt;5/10/2021&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;TD&gt;.&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;ID3&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;6/4/2021&lt;/TD&gt;&lt;TD&gt;7/10/2021&lt;/TD&gt;&lt;TD&gt;7/10/2021&lt;/TD&gt;&lt;TD&gt;6/20/2021&lt;/TD&gt;&lt;TD&gt;6/22/2021&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Maybe this helps to visualize. Each individual can have up to 50 tests, but many have only 1, or just a few.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for the tip about the 1, an easy oversight! However, that was intentional. Oftentimes the collect_date1 is the date desired, so I set the baseline at collect_date1. Then, in later steps, I attempted to write that IF any other collect_date[2-50] are closer in difference from the complete_date than collect_date1, that should be chosen as "closest" instead.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The issue with the current code is that for ID1, it chooses collect_date1 as the closest date to the complete_date, instead of collect_date2. ID2 is correct by default, but then the issue returns with ID3. It should ignore collect_date1, because pos_test1 is 0. However, it accurately recognizes that collect_date3 and pos_test3 should be included, and then it returns the value of collect_date1 as the closest value.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm not sure why it isn't overwriting the original closest variable when it finds a closest date. Maybe I don't need the placeholder of the first 2 collect_date1 lines? I will try that to see if it fixes it.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jul 2021 13:03:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754315#M237841</guid>
      <dc:creator>katemarieb</dc:creator>
      <dc:date>2021-07-15T13:03:22Z</dc:date>
    </item>
    <item>
      <title>Re: Using array to identify a date and make a new variable from identified date</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754382#M237876</link>
      <description>&lt;P&gt;Here is a data set of your "example" data for others to use as a start.&lt;/P&gt;
&lt;PRE&gt;data have;
  informat id $5. pos_test1-Pos_test3 f1. 
           Complete_date	Closest	Collect_date1	Collect_date2	Collect_date3 mmddyy10.;
  format Complete_date	Closest	Collect_date1	Collect_date2	Collect_date3 mmddyy10.;
  input id 	Pos_test1	Pos_test2	Pos_test3	Complete_date	Closest	Collect_date1	Collect_date2	Collect_date3;
datalines;
ID1	1	1	.	5/21/2021	6/20/2021	6/20/2021	6/10/2021	.
ID2	1	.	.	3/2/2021	5/10/2021	5/10/2021	.	        .
ID3	0	0	1	6/4/2021	7/10/2021	7/10/2021	6/20/2021	6/22/2021
;&lt;/PRE&gt;
&lt;P&gt;Now a question/comment: Why is the Collectdate_1 later then Collect_date2 but Collect_date_3 is later than 2 but before 1?&lt;/P&gt;
&lt;P&gt;I think that if your collect_dates were in chronologic order (with the positive values correlating) this would be an extremely simple exercise.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As it is, you could use another array that holds the differences and search that.&lt;/P&gt;
&lt;P&gt;If I understand your requirement this may be one way.&lt;/P&gt;
&lt;PRE&gt;data want;
   set have;
   array pt (*) pos_test: ;
   array cd (*) collect_date: ;
   /* to hold the differences, since new variables need explicit size*/
   array dd (3) ; 
   do i=1 to dim(pt);
      if pt[i]=1 then dd[i]=(abs(cd[i] - complete_date));
   end;
   if min(of dd(*))&amp;gt; 0 then pos = whichn(min(of dd(*)),of dd(*));
   if pos&amp;gt;0 then new_closest= cd[pos];&lt;BR /&gt;   format new_closest mmddyy10.;
run;&lt;/PRE&gt;
&lt;P&gt;The dd array holds the date differences only when the test is positive. Absolute values are used to get "closest".&lt;/P&gt;
&lt;P&gt;The Whichn, and character version Whichc, returns the first position that the value in the first position is found occurs in a list of following values. The "if min(of " bit of code is to only try to find a minimum position when there is at least one positive test.&lt;/P&gt;
&lt;P&gt;I also create a new_closest so you can actually compare with the current value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I really didn't understand why you were restricting&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jul 2021 16:27:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-array-to-identify-a-date-and-make-a-new-variable-from/m-p/754382#M237876</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-07-15T16:27:10Z</dc:date>
    </item>
  </channel>
</rss>

