<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Fastest way to execute in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208976#M51795</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Indexing by ID and putting the small table first reduces time from 20 minutes (I looked back at the log for actual results, I waaay underestimated earlier. Really 20 minutes!!!) to 12 minutes. I'm testing some of the other ideas.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 24 Mar 2015 14:43:31 GMT</pubDate>
    <dc:creator>rwnj</dc:creator>
    <dc:date>2015-03-24T14:43:31Z</dc:date>
    <item>
      <title>Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208972#M51791</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I frequently get requests for records we store in a large sas file. The table has approx 3 million records and 500 fields (about 10gb). The records I need I import into in a small table of approx 2000-3000 records, with only one field. Using PROC SQL is unacceptably slow, this would take probably a fraction of a second on a sql server. Instead SAS takes 12-15 minutes. How can I speed this up?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table results as&lt;/P&gt;&lt;P&gt;select a.*&lt;/P&gt;&lt;P&gt;from&lt;/P&gt;&lt;P&gt;large_table a&lt;/P&gt;&lt;P&gt;left join&lt;/P&gt;&lt;P&gt;small_table b&lt;/P&gt;&lt;P&gt;on a.record_id = b.record_id&lt;/P&gt;&lt;P&gt;where b.record_id ne '';&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I have a smaller number of records, generally I'll just put the records in a where condition and load the file into memory (I have 16gb ram) via SASFILE, if there are only a couple hundred records, I get results in maybe 1-2 seconds.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;sasfile.large_table load;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table results as&lt;/P&gt;&lt;P&gt;select *&lt;/P&gt;&lt;P&gt;from large_table&lt;/P&gt;&lt;P&gt;where record_id in ('0000000001','0000000002');&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This method works great!! But there's a limit to how many records I can put in the where statement. I'd like to find a method that works as fast, using a selection criteria based on the contents of a small sas table.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:13:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208972#M51791</guid>
      <dc:creator>rwnj</dc:creator>
      <dc:date>2015-03-24T14:13:53Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208973#M51792</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If the second code works quickly I am guessing that the large table is sorted/indexed by ID, for the first piece of code make sure the small table is also sorted/indexed by ID.&amp;nbsp; You should put the small table first in your code, the first table you reference will be put into your memory.&amp;nbsp; Use the smaller indexed table first and it might improve performance. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:19:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208973#M51792</guid>
      <dc:creator>Steelers_In_DC</dc:creator>
      <dc:date>2015-03-24T14:19:55Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208974#M51793</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Why use a left join, an inner join would do it without having to put a where statement.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:25:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208974#M51793</guid>
      <dc:creator>tlk</dc:creator>
      <dc:date>2015-03-24T14:25:44Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208975#M51794</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;1. If your second solution worked well for you, then why not stick to it and using a subquery to accommodate your small table:&lt;/P&gt;&lt;P&gt;sasfile.large_table load;&lt;/P&gt;&lt;P&gt; proc sql;&lt;/P&gt;&lt;P&gt;create table results as&lt;/P&gt;&lt;P&gt;select *&lt;/P&gt;&lt;P&gt;from large_table&lt;/P&gt;&lt;P&gt;where record_id in (select record_id from small_table);&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2. You could try Hash table by putting the small table in the Hash:&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;if _n_=1 then do;&lt;/P&gt;&lt;P&gt;declare hash h(dataset:'small_table');&lt;/P&gt;&lt;P&gt;h.definekey('record_id');&lt;/P&gt;&lt;P&gt;h.definedone();&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;set large_table;&lt;/P&gt;&lt;P&gt;if h.check()=0;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;3. If both of your tables are presorted by 'record_id', merge can also be an option:&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;merge small(in=s) big(in=b);&lt;/P&gt;&lt;P&gt;by record_id;&lt;/P&gt;&lt;P&gt;if a and b;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;BTW, I am puzzled by your first solution, isn't it to be 'right join' in this case?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Haikuo &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:35:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208975#M51794</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2015-03-24T14:35:14Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208976#M51795</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Indexing by ID and putting the small table first reduces time from 20 minutes (I looked back at the log for actual results, I waaay underestimated earlier. Really 20 minutes!!!) to 12 minutes. I'm testing some of the other ideas.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:43:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208976#M51795</guid>
      <dc:creator>rwnj</dc:creator>
      <dc:date>2015-03-24T14:43:31Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208977#M51796</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Another approach if the second method worked for you ...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A single macro variable is plenty large enough to hold 3000 quoted IDs.&amp;nbsp; Why not automate that process:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql noprint;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; select "'" || strip(record_id) || "'" into : macro_var separated by ',' from smaller_table;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Then use the macro variable:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table results as&lt;/P&gt;&lt;P&gt;select *&lt;/P&gt;&lt;P&gt;from large_table&lt;/P&gt;&lt;P&gt;where record_id in (&amp;amp;macro_var);&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It still has to be tested for timing, since searching through a list of 3,000 will take longer than searching through a list of 3.&amp;nbsp; But any solution based on a single table is likely to run faster.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:48:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208977#M51796</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2015-03-24T14:48:42Z</dc:date>
    </item>
    <item>
      <title>Re: Fastest way to execute</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208978#M51797</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;OK, we have a winner Hai.kuo, answer #1. I should have thought to use a subquery, I guess I didn't know how it would execute with sasfile. The following (once sasfile loads) executes in 4.1 seconds:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;sasfile.large_table load;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table results as&lt;/P&gt;&lt;P&gt;select *&lt;/P&gt;&lt;P&gt;from large_table&lt;/P&gt;&lt;P&gt;where record_id in (select record_id from small_table);&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 24 Mar 2015 14:49:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Fastest-way-to-execute/m-p/208978#M51797</guid>
      <dc:creator>rwnj</dc:creator>
      <dc:date>2015-03-24T14:49:24Z</dc:date>
    </item>
  </channel>
</rss>

