<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with efficiency dealing with large dataset (find first/last visit date) in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90071#M19117</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If your source data is something other than a SAS table, say Oracle or SQL Server, then you have the option of doing it with pass-through. Other than that, you are stuck with Proc Sort. I doubt a hash table would help: first you need to make sure your whole table fits into RAM, and even if it does, I doubt that hash sorting would be more efficient than Proc Sort.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;my 2 cents,&lt;/P&gt;&lt;P&gt;Haikuo &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 27 Mar 2013 16:20:49 GMT</pubDate>
    <dc:creator>Haikuo</dc:creator>
    <dc:date>2013-03-27T16:20:49Z</dc:date>
    <item>
      <title>Help with efficiency dealing with large dataset (find first/last visit date)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90069#M19115</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am working with a dataset that has a few million records. Instead of sorting by id and visit date and then using a data step to take the first. and last. visit dates for each id, I would like a more efficient way to get the data. The proc sort on the dataset takes forever. Any help would be appreciated. Thanks!&lt;/P&gt;&lt;P&gt;-Steve&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 Mar 2013 15:48:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90069#M19115</guid>
      <dc:creator>browste</dc:creator>
      <dc:date>2013-03-27T15:48:39Z</dc:date>
    </item>
    <item>
      <title>Re: Help with efficiency dealing with large dataset (find first/last visit date)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90070#M19116</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I would be tempted to use PROC SUMMARY with a CLASS statement, which does not require the data to be sorted.&lt;/P&gt;&lt;P&gt;Something like:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc summary data=have nway; /* nway: keep only the per-ID rows, not the overall total */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; class ID;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; var VisitDate;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; output out=want max= min= / autoname; /* creates VisitDate_Max and VisitDate_Min */&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; create table want as&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; select id, min(visitdate) as firstdate, max(visitdate) as lastdate&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; from have&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; group by id;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 Mar 2013 16:15:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90070#M19116</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2013-03-27T16:15:34Z</dc:date>
    </item>
    <item>
      <title>Re: Help with efficiency dealing with large dataset (find first/last visit date)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90071#M19117</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If your source data is something other than a SAS table, say Oracle or SQL Server, then you have the option of doing it with pass-through. Other than that, you are stuck with Proc Sort. I doubt a hash table would help: first you need to make sure your whole table fits into RAM, and even if it does, I doubt that hash sorting would be more efficient than Proc Sort.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;my 2 cents,&lt;/P&gt;&lt;P&gt;Haikuo &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 Mar 2013 16:20:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90071#M19117</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2013-03-27T16:20:49Z</dc:date>
    </item>
    <item>
      <title>Re: Help with efficiency dealing with large dataset (find first/last visit date)</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90072#M19118</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks so much ballardw, both of those are much, much quicker. I really appreciate it!&lt;/P&gt;&lt;P&gt;-Steve&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 27 Mar 2013 16:31:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-efficiency-dealing-with-large-dataset-find-first-last/m-p/90072#M19118</guid>
      <dc:creator>browste</dc:creator>
      <dc:date>2013-03-27T16:31:14Z</dc:date>
    </item>
  </channel>
</rss>

