<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Finding the closest value from two datasets in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/606152#M175980</link>
    <description>&lt;P&gt;That's fantastic. Super quick process, thanks for the solution!&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to understand how it works and my ability there is limited. I follow about half of what's going on.&lt;/P&gt;</description>
    <pubDate>Thu, 21 Nov 2019 14:36:06 GMT</pubDate>
    <dc:creator>davidsmarch</dc:creator>
    <dc:date>2019-11-21T14:36:06Z</dc:date>
    <item>
      <title>Finding the closest value from two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/604869#M175415</link>
      <description>&lt;DIV class="lia-quilt-column lia-quilt-column-04 lia-quilt-column-left lia-quilt-column-main-left"&gt;&lt;DIV class="lia-quilt-column-alley lia-quilt-column-alley-left"&gt;&lt;DIV class="lia-message-author lia-component-author"&gt;I saw some previous solutions that don't quite fit, so I though I'd try to see if I could get some help.&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="lia-quilt-column lia-quilt-column-20 lia-quilt-column-right lia-quilt-column-main-right"&gt;&lt;DIV class="lia-quilt-column-alley lia-quilt-column-alley-right"&gt;&lt;DIV class="lia-message-body"&gt;&lt;DIV class="lia-message-body-content"&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two datasets. One dataset (time.dat) is very long (millions of rows) and contains three variables: subject, column, and time_n in this format:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;subj col&amp;nbsp; &amp;nbsp;time_n&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 1&amp;nbsp; &amp;nbsp; 1564623266038&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp; 1564623266044&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 3&amp;nbsp; &amp;nbsp; 1564623266058&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 4&amp;nbsp; &amp;nbsp; 1564623266073&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 5&amp;nbsp; &amp;nbsp; 1564623266090&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 6&amp;nbsp; &amp;nbsp; 1564623266106&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Each subject has 1000's of columns that each have a unique time_n. There are also many subjects.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The second set (onloadtime.dat) has all of the same subjects and three variables, subject, trial, and onloadtime_n in this format:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;subj trial onloadtime_n&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 1&amp;nbsp; &amp;nbsp; 1564623265779&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp; 1564623272356&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 3&amp;nbsp; &amp;nbsp; 1564623275373&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 4&amp;nbsp; &amp;nbsp; 1564623278728&lt;/P&gt;&lt;P&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 5&amp;nbsp; &amp;nbsp; 1564623282137&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Each subject only has 180 onloadtime's. What I need to do is find the closest time_n to each onloadtime_n and to indicate which trial corresponds to that time. Such that the resultant data looks like:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;subj col&amp;nbsp; &amp;nbsp; time_n&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;onloadtime_n&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;trial&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 1&amp;nbsp; &amp;nbsp;1564623266038&amp;nbsp; &amp;nbsp; 1564623265779&amp;nbsp; &amp;nbsp; &amp;nbsp;1&amp;nbsp;&amp;nbsp;&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp;1564623266044&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 3&amp;nbsp; &amp;nbsp;1564623266058&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 4&amp;nbsp; &amp;nbsp;1564623266073&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 5&amp;nbsp; &amp;nbsp;1564623266090&lt;BR /&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; 6&amp;nbsp; &amp;nbsp;1564623266106&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;2&amp;nbsp; &amp;nbsp; 450&amp;nbsp; 1564623272324&amp;nbsp; &amp;nbsp; 1564623272356&amp;nbsp; &amp;nbsp;2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And to repeat this per subject. Essentially I need to know what trial in the shorter set (onloadtime) corresponds to what closest column in the long set (time), and to the indicate what time trial that actually is. I hope that makes sense. I'd love any support and would be happy to answer any questions.&amp;nbsp;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 18 Nov 2019 02:43:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/604869#M175415</guid>
      <dc:creator>davidsmarch</dc:creator>
      <dc:date>2019-11-18T02:43:55Z</dc:date>
    </item>
    <item>
      <title>Re: Finding the closest value from two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/605000#M175480</link>
      <description>&lt;P&gt;You can, of course, write an elegant solution in SQL, submit it, and wait a couple of days before getting the result.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An alternative may be a data step solution something like this (probably a lot faster):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data time;
input subj col   time_n;
cards;
2      1    1564623266038
2      2    1564623266044
2      3    1564623266058
2      4    1564623266073
2      5    1564623266090
2      6    1564623266106
;run;

data onloadtime;
  input subj trial onloadtime_n;
cards;
2      1    1564623265779
;run;

Data v_interval/view=v_interval;
  set time;
  by subj;
  if first.subj then do;
    next_time=time_n;
      time_n=-1E38;
      output;
      time_n=next_time;
      end;
  if not last.subj then do;
    _N_=_N_+1;
    set time(keep=time_n rename=(time_n=next_time)) point=_N_;
    end;
  else next_time=1E38;
  output;
run;



data want;
  set v_interval(rename=(time_n=onloadtime_n))  onloadtime(in=load);
  by subj onloadtime_n;
  if load then do;
    if onloadtime_n-_last_time&amp;lt;_next_time-onloadtime_n then
      time_n=_last_time;
    else 
      time_n=_next_time;
´   output;
    end;
  else do;
    _next_time=next_time;
    _last_time=onloadtime_n;
    _col=col;
    end;
  retain _:;
  keep subj time_n onloadtime_n trial _col;
  rename _col=col;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Both tables have til be sorted by SUBJ and their time variable for this to work. The datastep view creates a table of time intervals using SET with POINT= for look-ahead. The values -1E38 and 1E38 are supposed to be "infinitely" lower resp. higher than any actual time values in the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then it is just a matter of setting the view and the load-table in order, and assigning the right value to the TIME_N variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Nov 2019 13:55:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/605000#M175480</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2019-11-18T13:55:28Z</dc:date>
    </item>
    <item>
      <title>Re: Finding the closest value from two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/606152#M175980</link>
      <description>&lt;P&gt;That's fantastic. Super quick process, thanks for the solution!&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to understand how it works and my ability there is limited. I follow about half of what's going on.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Nov 2019 14:36:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Finding-the-closest-value-from-two-datasets/m-p/606152#M175980</guid>
      <dc:creator>davidsmarch</dc:creator>
      <dc:date>2019-11-21T14:36:06Z</dc:date>
    </item>
  </channel>
</rss>

