<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Faster way to do this in data step? in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63655#M18104</link>
    <description>I have found similar problems with SQL joins of large data sets.  While the data step is usually faster, a merge requires both data sets to be sorted by the join keys first.  I have begun experimenting with hash tables in the data step to join records.  While I'm not comfortable fully explaining them, they have provided the performance enhancements that I wanted.  There are many good SAS Global Forum papers that explain them and give examples.  Just search for "hash tables" in the tech support search.&lt;BR /&gt;
&lt;BR /&gt;
Here's one paper....&lt;BR /&gt;
&lt;BR /&gt;
&lt;A href="http://www2.sas.com/proceedings/forum2008/029-2008.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/forum2008/029-2008.pdf&lt;/A&gt;</description>
    <pubDate>Wed, 12 Aug 2009 15:34:52 GMT</pubDate>
    <dc:creator>LAP</dc:creator>
    <dc:date>2009-08-12T15:34:52Z</dc:date>
    <item>
      <title>Faster way to do this in data step?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63654#M18103</link>
      <description>Sorry for the noobish question. I come from a SQL background and am trying to get a handle on SAS...&lt;BR /&gt;
&lt;BR /&gt;
So I've been adding a variable based on matching observations to a second table...&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
  create table matching as&lt;BR /&gt;
  select a.*, b.id_final_new as match_id&lt;BR /&gt;
  from people a left join responders b&lt;BR /&gt;
  on a.id_one=b.final_id_one and a.id_two=b.final_id_two;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
This takes forever! Can I do the same in a quick data step?</description>
      <pubDate>Wed, 12 Aug 2009 13:59:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63654#M18103</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-08-12T13:59:59Z</dc:date>
    </item>
    <item>
      <title>Re: Faster way to do this in data step?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63655#M18104</link>
      <description>I have found similar problems with SQL joins of large data sets.  While the data step is usually faster, a merge requires both data sets to be sorted by the join keys first.  I have begun experimenting with hash tables in the data step to join records.  While I'm not comfortable fully explaining them, they have provided the performance enhancements that I wanted.  There are many good SAS Global Forum papers that explain them and give examples.  Just search for "hash tables" in the tech support search.&lt;BR /&gt;
&lt;BR /&gt;
Here's one paper....&lt;BR /&gt;
&lt;BR /&gt;
&lt;A href="http://www2.sas.com/proceedings/forum2008/029-2008.pdf" target="_blank"&gt;http://www2.sas.com/proceedings/forum2008/029-2008.pdf&lt;/A&gt;</description>
      <pubDate>Wed, 12 Aug 2009 15:34:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63655#M18104</guid>
      <dc:creator>LAP</dc:creator>
      <dc:date>2009-08-12T15:34:52Z</dc:date>
    </item>
    <item>
      <title>Re: Faster way to do this in data step?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63656#M18105</link>
      <description>Thanks, looks like I've got a lot of reading to do...</description>
      <pubDate>Wed, 12 Aug 2009 20:03:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63656#M18105</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-08-12T20:03:05Z</dc:date>
    </item>
    <item>
      <title>Re: Faster way to do this in data step?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63657#M18106</link>
      <description>You can test this code. If the people data set is not very large, it should run very fast.&lt;BR /&gt;
&lt;BR /&gt;
 /* Rename the keys to match people, rename the looked-up variable to match_id, and build a composite index */&lt;BR /&gt;
 data responders(keep=id_one id_two match_id index=(cr1=(id_one id_two)));&lt;BR /&gt;
  set  responders(rename=(final_id_one=id_one final_id_two=id_two id_final_new=match_id));&lt;BR /&gt;
 run;&lt;BR /&gt;
&lt;BR /&gt;
 data matching;&lt;BR /&gt;
  set people;&lt;BR /&gt;
  set responders key=cr1/unique;  /* indexed lookup on id_one, id_two */&lt;BR /&gt;
    select(_iorc_);&lt;BR /&gt;
      when(%sysrc(_sok)) do; end;  /* match found */&lt;BR /&gt;
      when(%sysrc(_dsenom)) do; match_id= . /* no match; if match_id is char then should be match_id=''*/; _error_ = 0; end;&lt;BR /&gt;
      otherwise do;&lt;BR /&gt;
         put 'ERROR_: Unexpected value for _IORC_= ' _iorc_ ' Program terminating. Data set accessed is responders';&lt;BR /&gt;
         put _all_; _error_ = 0; stop; end;&lt;BR /&gt;
    end;&lt;BR /&gt;
 run;</description>
      <pubDate>Thu, 13 Aug 2009 04:42:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63657#M18106</guid>
      <dc:creator>Oleg_L</dc:creator>
      <dc:date>2009-08-13T04:42:06Z</dc:date>
    </item>
    <item>
      <title>Re: Faster way to do this in data step?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63658#M18107</link>
      <description>You could investigate using a hash table in a data step.&lt;BR /&gt;
While the syntax is unusual at first glance, it is quite easy to grasp.&lt;BR /&gt;
Hash tables are fast as they are loaded in memory, and don't require any prior sorting.&lt;BR /&gt;
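A sketch using the table and variable names from the original question (untested, and assuming the key variables have the same type in both tables):&lt;BR /&gt;
data matching(drop=final_id_one final_id_two rename=(id_final_new=match_id));&lt;BR /&gt;
  if _N_=1 then do;  /* load the hash table once */&lt;BR /&gt;
    if 0 then set responders(keep=final_id_one final_id_two id_final_new);  /* define host variables */&lt;BR /&gt;
    declare hash h(dataset:'responders');&lt;BR /&gt;
    h.defineKey('final_id_one','final_id_two'); h.defineData('id_final_new'); h.defineDone();&lt;BR /&gt;
  end;&lt;BR /&gt;
  set people;  /* base table */&lt;BR /&gt;
  if h.find(key:id_one, key:id_two) ne 0 then call missing(id_final_new);  /* no match: clear the retained value */&lt;BR /&gt;
run;</description>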
      <pubDate>Thu, 13 Aug 2009 05:31:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Faster-way-to-do-this-in-data-step/m-p/63658#M18107</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2009-08-13T05:31:00Z</dc:date>
    </item>
  </channel>
</rss>

