<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What is the most efficient way to join on most recent record? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/802102#M315735</link>
    <description>&lt;P&gt;How many rows are there in each table? Is the company database an external RDBMS? If you are joining between a SAS table and an&amp;nbsp;RDBMS then either SAS has to read all of the&amp;nbsp;RDBMS down to SAS and join there, or load the SAS table into the&amp;nbsp;RDBMS and join there. Typically you should load the smaller table into the larger table's environment for best performance.&lt;/P&gt;</description>
    <pubDate>Tue, 15 Mar 2022 03:36:55 GMT</pubDate>
    <dc:creator>SASKiwi</dc:creator>
    <dc:date>2022-03-15T03:36:55Z</dc:date>
    <item>
      <title>What is the most efficient way to join on most recent record?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801635#M315500</link>
      <description>&lt;P&gt;Hi everybody!&lt;BR /&gt;&lt;BR /&gt;I have two large datasets that lies in the companys database and I want to join these to another dataset that I have from before in SAS.&lt;/P&gt;&lt;P&gt;These two large datasets contains information about the same account on different dates.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I want to do is join the latest record of a account from both datasets to the first dataset, but I find that the ways I have done this to now is very inefficient and takes forever.&lt;/P&gt;&lt;P&gt;Is there a good way to do this efficient in proc sql?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is the way I have tried to this for only one dataset is something like this, but this is very slow:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt; &lt;STRONG&gt;sql&lt;/STRONG&gt;;&lt;/P&gt;&lt;P&gt;create table Loans as&lt;/P&gt;&lt;P&gt;select&lt;/P&gt;&lt;P&gt;a.*,&lt;/P&gt;&lt;P&gt;b.*&lt;/P&gt;&lt;P&gt;&amp;nbsp; FROM customer_information as a inner join&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (&amp;nbsp;&amp;nbsp;&amp;nbsp; Select *&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; From Business.LOAN_INFORMATION&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Group by&amp;nbsp; ACCOUNTNR&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Having DATE_OF_RECORD=max(DATE_OF_RECORD) ) as b&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; on a.ACCOUNTNR = b.ACCOUNTNR;&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;quit&lt;/STRONG&gt;;&lt;/P&gt;</description>
      <pubDate>Fri, 11 Mar 2022 11:11:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801635#M315500</guid>
      <dc:creator>Wiko87</dc:creator>
      <dc:date>2022-03-11T11:11:42Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient way to join on most recent record?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801662#M315515</link>
      <description>&lt;P&gt;Let's say you have three datasets A, B, and C.&amp;nbsp; &amp;nbsp;A is sorted by ID/DATE_FROM_A, B is sorted by ID/DATE_FROM_B, and C by ID/DATE_FROM_C.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Youi want the latest obs from A and from B, but all the obs from C:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set a (in=ina keep=id)
      b (in=inb keep=id)
      c (in=inc) ;
  by id ;
  if first.id then call missing(of _all_);
  if ina then set a;
  if inb then set b;
  if inc;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 11 Mar 2022 13:53:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801662#M315515</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-03-11T13:53:11Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient way to join on most recent record?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801711#M315533</link>
      <description>&lt;P&gt;Just to be a devil's advocate, what definition of "most efficient" are you using?&lt;/P&gt;
&lt;P&gt;Several that come to mind as possibilities:&lt;/P&gt;
&lt;P&gt;Run using the least CPU&lt;/P&gt;
&lt;P&gt;Run using the least memory/disk space/network bandwidth&lt;/P&gt;
&lt;P&gt;Take the least time to program&lt;/P&gt;
&lt;P&gt;Take the fewest lines of code&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There are others.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You may want to consider that if your time programming the solution results in $XXXX of man-hour costs you may have spent way more than additional CPU Disk/Memory impacts for a "small" process, small meaning few records and/or steps needed. For very large data sets that spend a lot of time pushing things around a network you may want to consider that impact more than nice code. Trade-offs abound with this sort of problem. Such as how large is "large". We see people think their data is large with a few thousand records and 50 variables. Then there are the terabyte sized datasets...&lt;/P&gt;
&lt;P&gt;Sometimes the fewest lines of code may be the most difficult to understand when you look at them later.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Mar 2022 17:26:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801711#M315533</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-03-11T17:26:29Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient way to join on most recent record?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801902#M315630</link>
      <description>&lt;P&gt;Poor choice of word from my part. I was thinking mostly of least time to run.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And I understand your point.&lt;/P&gt;&lt;P&gt;I have found that the effort to find the how to write the program I want is not something I prioritize for now, but if there is a nice and effective way to do this, I'm very much interested for later projects.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Mar 2022 07:19:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/801902#M315630</guid>
      <dc:creator>Wiko87</dc:creator>
      <dc:date>2022-03-14T07:19:22Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient way to join on most recent record?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/802102#M315735</link>
      <description>&lt;P&gt;How many rows are there in each table? Is the company database an external RDBMS? If you are joining between a SAS table and an&amp;nbsp;RDBMS then either SAS has to read all of the&amp;nbsp;RDBMS down to SAS and join there, or load the SAS table into the&amp;nbsp;RDBMS and join there. Typically you should load the smaller table into the larger table's environment for best performance.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2022 03:36:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-is-the-most-efficient-way-to-join-on-most-recent-record/m-p/802102#M315735</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2022-03-15T03:36:55Z</dc:date>
    </item>
  </channel>
</rss>

