<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Theoretical merging question in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159247#M2898</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Based on the information as provided from your related question under: &lt;A _jive_internal="true" class="active_link" href="https://communities.sas.com/message/240300#240300" style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; color: #0e66ba; background-color: #ffffff;"&gt;https://communities.sas.com/message/240300#240300&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Given that the look-up table (EoY data) should fit into a hash we could take such an approach which should perform quite well. In below code I'm not 100% sure if I've got the calculation of the closest EoY date correct so you might need to amend this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data daily_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; infile cards dlm=',';&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; input date :ddmmyy10. ticker $;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; format date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cards;&lt;/P&gt;&lt;P&gt;12/12/2008,ABC&lt;/P&gt;&lt;P&gt;4/6/2008,ABC&lt;/P&gt;&lt;P&gt;25/3/2010,DEF&lt;/P&gt;&lt;P&gt;6/2/2002,DEF&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data eoy_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; infile cards dlm=',';&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; input date :ddmmyy10. ticker $;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; format date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; eoy_amt=_n_;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cards;&lt;/P&gt;&lt;P&gt;1/12/2010,ABC&lt;/P&gt;&lt;P&gt;1/12/2009,ABC&lt;/P&gt;&lt;P&gt;1/12/2008,ABC&lt;/P&gt;&lt;P&gt;1/12/2007,ABC&lt;/P&gt;&lt;P&gt;1/12/2006,ABC&lt;/P&gt;&lt;P&gt;1/12/2010,DEF&lt;/P&gt;&lt;P&gt;1/12/2009,DEF&lt;/P&gt;&lt;P&gt;1/12/2008,DEF&lt;/P&gt;&lt;P&gt;1/12/2007,DEF&lt;/P&gt;&lt;P&gt;1/12/2001,DEF&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set daily_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if _n_=1 then &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; do;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if 0 then set eoy_data(rename=(date=eoy_date));&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dcl hash h1(dataset:'eoy_data(rename=(date=eoy_date))');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineKey('ticker','eoy_date');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineData(all:'y');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineDone();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /** find closest 1 Dec&amp;lt;yyyy&amp;gt; date **/&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; format _match_date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if abs(intnx('year',date,-1,'e') - date) &amp;lt;= abs(intnx('year.12',date,0,'e') - date) then&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _match_date=intnx('year.12',date,0,'b');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; else &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _match_date=intnx('year.12',date,1,'b');&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* look up eoy data */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.find(key:ticker, key:_match_date);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Patrick Matter &#xD;
Oh, I believe I get it now. Your EoY data has a date of 1 December &amp;lt;yyyy&amp;gt; but this means like so often "month starting on...". The actual data is from the end of the month (the year) and that's where you want the date difference to be calculated from. I've amended the code accordingly.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 22 Dec 2014 02:01:33 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2014-12-22T02:01:33Z</dc:date>
    <item>
      <title>Theoretical merging question</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159244#M2895</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have 2 tables, one has 76 000 rows and the other 22 000 rows, and I want to merge them according to a certain criteria.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I read that regardless of the merging criteria that I set, SAS will always multiply the tables in its calculation, and in this case the multiplication is equal to 1.6 Billion.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The reason why I am asking this question is because I ran the code an hour ago and its still thinking.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can there be a shortcut?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 21 Dec 2014 17:48:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159244#M2895</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2014-12-21T17:48:29Z</dc:date>
    </item>
    <item>
      <title>Re: Theoretical merging question</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159245#M2896</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Your question is not very precise.&lt;/P&gt;&lt;P&gt;If you are using data step merge by, it does not multiply. It does more of a roe by row merge.&lt;/P&gt;&lt;P&gt;SQL on the other, does "logically" a Cartesian product, and applies any where/on clause. But in real world, the SQL optimizer/planner does prevent the total product to be created.&lt;/P&gt;&lt;P&gt;Share your code and sample data for more precise answers.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 21 Dec 2014 18:46:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159245#M2896</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2014-12-21T18:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: Theoretical merging question</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159246#M2897</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi LinusH,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I already asked this question on the community blog &lt;/P&gt;&lt;P&gt;&lt;A _jive_internal="true" href="https://communities.sas.com/message/240300#240300"&gt;https://communities.sas.com/message/240300#240300&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;there I have a small sample and community members gave me an answer that works good for my small sample, but with the real data that I have it takes me already 3 hrs&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I guess it is better to go back to the link of my initial question &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 21 Dec 2014 19:53:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159246#M2897</guid>
      <dc:creator>ilikesas</dc:creator>
      <dc:date>2014-12-21T19:53:30Z</dc:date>
    </item>
    <item>
      <title>Re: Theoretical merging question</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159247#M2898</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Based on the information as provided from your related question under: &lt;A _jive_internal="true" class="active_link" href="https://communities.sas.com/message/240300#240300" style="font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; color: #0e66ba; background-color: #ffffff;"&gt;https://communities.sas.com/message/240300#240300&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Given that the look-up table (EoY data) should fit into a hash we could take such an approach which should perform quite well. In below code I'm not 100% sure if I've got the calculation of the closest EoY date correct so you might need to amend this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data daily_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; infile cards dlm=',';&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; input date :ddmmyy10. ticker $;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; format date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cards;&lt;/P&gt;&lt;P&gt;12/12/2008,ABC&lt;/P&gt;&lt;P&gt;4/6/2008,ABC&lt;/P&gt;&lt;P&gt;25/3/2010,DEF&lt;/P&gt;&lt;P&gt;6/2/2002,DEF&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data eoy_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; infile cards dlm=',';&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; input date :ddmmyy10. ticker $;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; format date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; eoy_amt=_n_;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cards;&lt;/P&gt;&lt;P&gt;1/12/2010,ABC&lt;/P&gt;&lt;P&gt;1/12/2009,ABC&lt;/P&gt;&lt;P&gt;1/12/2008,ABC&lt;/P&gt;&lt;P&gt;1/12/2007,ABC&lt;/P&gt;&lt;P&gt;1/12/2006,ABC&lt;/P&gt;&lt;P&gt;1/12/2010,DEF&lt;/P&gt;&lt;P&gt;1/12/2009,DEF&lt;/P&gt;&lt;P&gt;1/12/2008,DEF&lt;/P&gt;&lt;P&gt;1/12/2007,DEF&lt;/P&gt;&lt;P&gt;1/12/2001,DEF&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set daily_data;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if _n_=1 then &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; do;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if 0 then set eoy_data(rename=(date=eoy_date));&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dcl hash h1(dataset:'eoy_data(rename=(date=eoy_date))');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineKey('ticker','eoy_date');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineData(all:'y');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineDone();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /** find closest 1 Dec&amp;lt;yyyy&amp;gt; date **/&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; format _match_date ddmmyy10.;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if abs(intnx('year',date,-1,'e') - date) &amp;lt;= abs(intnx('year.12',date,0,'e') - date) then&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _match_date=intnx('year.12',date,0,'b');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; else &lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _match_date=intnx('year.12',date,1,'b');&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* look up eoy data */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.find(key:ticker, key:_match_date);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Patrick Matter &#xD;
Oh, I believe I get it now. Your EoY data has a date of 1 December &amp;lt;yyyy&amp;gt; but this means like so often "month starting on...". The actual data is from the end of the month (the year) and that's where you want the date difference to be calculated from. I've amended the code accordingly.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 22 Dec 2014 02:01:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Theoretical-merging-question/m-p/159247#M2898</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2014-12-22T02:01:33Z</dc:date>
    </item>
  </channel>
</rss>

