<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Transform Proc SQL to a Hash Object in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639130#M190069</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt; Utility files are not compressed.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt; You could try 2 steps: Join a and b (and maybe f) on id. This will reduce the number of rows, and the sql parser will know what it's dealing with and will adopt the most efficient method. Then do the other joins against that first table. Don't forget the &lt;STRONG&gt;_method&lt;/STRONG&gt; option to monitor the parser's choices.&lt;/P&gt;</description>
    <pubDate>Sat, 11 Apr 2020 08:23:40 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2020-04-11T08:23:40Z</dc:date>
    <item>
      <title>Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/638994#M190025</link>
      <description>&lt;P&gt;SAS seems to run out of work disc space (sorting space) when I run the proc sql, even if I specify the SQL Optimizer to use a hash join (magic=103), so I'm looking to transform the following proc sql into a hash object approach. How I go about doing this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql magic=103;
  create table final as 
  select *
  from have a
  inner join ids b on (a.id = b.id)
  left join nad g on (a.fpc = g.fpc)
  left join pricing e on (a.gpi = e.gpi) and (d.list_id = e.list_id)
  left join network f on (a.id = f.id)
  left join client_segment d on (a.client = d.client) and (a.bob = d.bob) and 
           (a.seg = d.seg) and (a.ex_id = d.ex_id) and (a.aids = d.aids)
  left join instate c on (a.ncp_id = c.ncp_id);
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 10 Apr 2020 16:01:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/638994#M190025</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-10T16:01:54Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/638997#M190027</link>
      <description>&lt;P&gt;Don't use the asterisk, use an exhaustive list of wanted variables with correct table aliases. Otherwise you will at least get WARNINGs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What are the sizes (in terms of observation numbers) and the sizes of the variables you want to join to your main table? This will determine if you can fit the tables into memory.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 16:08:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/638997#M190027</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-10T16:08:31Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639024#M190035</link>
      <description>&lt;P&gt;Hey Kurt&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The size of the base table is approx. 58,000,000 records&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The sizes of the join fields range from a length of 2 to 14. Fore example, client is 2, gpi is 14, fpc is 11, ncpdp is 5...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 17:28:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639024#M190035</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-10T17:28:41Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639028#M190036</link>
      <description>&lt;P&gt;We need to know the numbers of obs in all the tables you want to join to the base, and the lengths of the variables you want from there.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 17:49:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639028#M190036</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-10T17:49:42Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639088#M190050</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;&amp;nbsp; - 58 million rows isn't large. How about trying - options compress = yes; - to see if that helps. If your base table contains lots of long character variables then compressing will save a lot of space.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are using a SAS server, then I'd suggest it doesn't have enough SAS WORK space configured. Talk to your SAS administrator about this. You may be able to fix this program to use less WORK space, but then you'll&amp;nbsp; run out again in the next big job. Fix the problem, not the symptom.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 22:27:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639088#M190050</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-04-10T22:27:04Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639103#M190053</link>
      <description>&lt;P&gt;This join will require a lot of sorting of your Have table/intermediary result sets and unless the inner join doesn't reduce the volumes considerably that's going to be sorting 58M rows multiple times.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorting will create intermediary files on disk so as&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;suggests using option compress=yes should help to save disk space and I/O and though also improve performance.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could avoid at least one sort by moving join...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;left join network f on (a.id = f.id)&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;...right after the inner join which uses the same key column.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I believe option magic=103 only works for inner joins.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A data step hash approach could eventually improve performance but you would need to have sufficient memory for loading all the lookup tables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you run a Proc Contents over your tables then you'll get the information how many rows a table has and the names and lengths of all variables in the table. Sum the lengths of all variables you want, multiply by the number of rows in the table, add 17MB to the result and you'll get an estimate how much memory a hash table will consume.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using a Select * when joining tables will create warnings if the same variable name exists in more than one table. You should list your variables explicitly.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can save yourself some typing by executing below - and then use the expanded variable list in the SAS log as starting point for your select statement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql feedback inobs=1;
/*  create table final as */
  select *
  from have a
  inner join ids b on (a.id = b.id)
  left join nad g on (a.fpc = g.fpc)
  left join pricing e on (a.gpi = e.gpi) and (d.list_id = e.list_id)
  left join network f on (a.id = f.id)
  left join client_segment d on (a.client = d.client) and (a.bob = d.bob) and 
           (a.seg = d.seg) and (a.ex_id = d.ex_id) and (a.aids = d.aids)
  left join instate c on (a.ncp_id = c.ncp_id);
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You can also pre-sort your lookup tables. This won't avoid the sorting of the Have table and the intermediary results, but it will avoid sorting of the lookup tables as part of the SQL.&lt;/P&gt;
&lt;P&gt;So you could use a Proc Sort first for you lookup tables. The SQL won't sort the tables again if it has it is already sorted AND has the sortedby=&amp;lt;variable name(s)) property set (which Proc Sort will do).&lt;/P&gt;
&lt;P&gt;If you know that a table is already sorted the way you need it but the table metadata doesn't have the sortedby attribute set (proc contents will show you) then you can also formulate it explicitly when using the table in the SQL; i.e.&amp;nbsp;&lt;CODE class=" language-sas"&gt;left join network(sortedby=id) f on (a.id = f.id)&lt;/CODE&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And last but not least:&lt;/P&gt;
&lt;P&gt;Make sure the relationships between table Have and any of the other tables is M:(0 or 1). Any other relationship could increase the row count dramatically and explain why you're running out of workspace.&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 04:00:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639103#M190053</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-04-12T04:00:10Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639130#M190069</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt; Utility files are not compressed.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt; You could try 2 steps: Join a and b (and maybe f) on id. This will reduce the number of rows, and the sql parser will know what it's dealing with and will adopt the most efficient method. Then do the other joins against that first table. Don't forget the &lt;STRONG&gt;_method&lt;/STRONG&gt; option to monitor the parser's choices.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2020 08:23:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639130#M190069</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-11T08:23:40Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639227#M190101</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;- True, but the code as posted contains temporary SAS datasets where compression may help.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2020 23:27:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639227#M190101</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-04-11T23:27:13Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639229#M190103</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt; I don't see temporary data sets. Can you tell em more?&lt;/P&gt;
&lt;P&gt;Also, one thing that sometimes help is using a compressed folder for utility files.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2020 23:38:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639229#M190103</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-11T23:38:44Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639232#M190105</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&amp;gt;You could avoid at least one sort by moving join...&amp;nbsp; ...right after the inner join which uses the same key column.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Oh wow. I'm so disappointed, I was certain the SQL parser did that. You are right it does not.&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 00:20:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639232#M190105</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-12T00:20:00Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639237#M190108</link>
      <description>&lt;P&gt;Indexes haven't been mentioned yet.&lt;/P&gt;
&lt;P&gt;It may well be, depending on their size, that tables such as pricing and instate just have to be indexed, to avoid sorting the main table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 00:37:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639237#M190108</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-12T00:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639246#M190117</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Indexes haven't been mentioned yet.&lt;/P&gt;
&lt;P&gt;It may well be, depending on their size, that tables such as pricing and instate just have to be indexed, to avoid sorting the main table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Thought about it and actually played around with some mock-up data but couldn't make the left joins use the index. Then I found &lt;A href="https://support.sas.com/resources/papers/proceedings/proceedings/sugi29/270-29.pdf" target="_self"&gt;below paper&lt;/A&gt;:&lt;/P&gt;
&lt;P&gt;"&lt;EM&gt;In a LEFT join PROC SQL must pick every row in the LEFT table. If there are any indexes built on the LEFT table,&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;SQL cannot use them for a LEFT join. It can make use of indexes on the right table, because only certain rows are&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;selected from the RIGHT table – those that match what are in the LEFT table. But, there is no way for PROC SQL to&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;use an index built on the LEFT table. So, do not waste your time building and maintaining an index on the LEFT&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;table for the purpose of doing a LEFT join&lt;/EM&gt;."&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 01:38:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639246#M190117</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-04-12T01:38:24Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639251#M190122</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt; This makes sense. Hence my suggestion that the right tables be indexed.&lt;/P&gt;
&lt;P&gt;You can force the using index by using option (idxname=).&lt;/P&gt;
&lt;P&gt;I expect that the tables have to be large (especially wide), and the proportion of retrieved rows low, before the index is faster than a sort.&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 03:04:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639251#M190122</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-12T03:04:59Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639253#M190124</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql magic=103;
  create table final as 
  select *
  from have a
  inner join ids b on (a.id = b.id)
  left join nad g on (a.fpc = g.fpc)
  left join pricing e on (a.gpi = e.gpi) and (d.list_id = e.list_id)
  left join network f on (a.id = f.id)
  left join client_segment d on (a.client = d.client) and (a.bob = d.bob) and 
           (a.seg = d.seg) and (a.ex_id = d.ex_id) and (a.aids = d.aids)
  left join instate c on (a.ncp_id = c.ncp_id);
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Have a look at:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/scottbass/SAS/blob/master/Macro/hash_define.sas" target="_blank"&gt;https://github.com/scottbass/SAS/blob/master/Macro/hash_define.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/scottbass/SAS/blob/master/Macro/hash_lookup.sas" target="_blank"&gt;https://github.com/scottbass/SAS/blob/master/Macro/hash_lookup.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You'll also need the utility macros:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/scottbass/SAS/blob/master/Macro/parmv.sas" target="_blank"&gt;https://github.com/scottbass/SAS/blob/master/Macro/parmv.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/scottbass/SAS/blob/master/Macro/parmv.sas" target="_blank"&gt;https://github.com/scottbass/SAS/blob/master/Macro/parmv.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Read the hash_define macro header thoroughly.&amp;nbsp; Run the self-contained examples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can simulate an inner join by filtering (subsetting if statement) on the return code from the hash object lookup.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Given your join criteria the order of your lookups is important.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hopefully your dimension tables will fit into the memory available to your SAS session, otherwise you'll need to split the code into multiple datasteps.&amp;nbsp; Alternatively, you can create indexes on your dimension tables and use an index key lookup ("double set statement" - see the documentation on the SET statement, esp. the examples showing transaction datasets).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Give it a go and post followup questions.&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 03:42:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639253#M190124</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2020-04-12T03:42:39Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639258#M190127</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt; This makes sense. Hence my suggestion that the right tables be indexed.&lt;/P&gt;
&lt;P&gt;You can force the using index by using option (idxname=).&lt;/P&gt;
&lt;P&gt;I expect that the tables have to be large (especially wide), and the proportion of retrieved rows low, before the index is faster than a sort.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Hmm... I can see how an index on the right table could get used for the first inner join. But isn't then already the first left join executing against the result set of the inner join and not against the right table? So how could an index on the right table get used here? Can you show me using below sample code?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data 
  Main(keep=m id id2) 
  Inner(keep=i id) 
  L1(keep=l1 id) 
  L2(keep=l2 id2) 
  ;
  length id id2 8;
  m='master';
  i='inner';
  l1='left 1';
  l2='left 2';
  do id=1 to 1000;
    id2=100*id;
    output main;
    if mod(id,2)=0 then output inner;
    if mod(id,4)=0 then output l1;
    if mod(id,8)=0 then output l2;
  end;
run;

proc sql;
  create index id on main(id);
quit;

proc sql _method;
  create table want as
    select m.id, m.id2,m.m, i.i, l1.l1, l2.l2 
    from main(idxname=id) m
    inner join inner i on m.id=i.id
    left join l1 on m.id=l1.id
    left join l2 on m.id2=l2.id2
  ;
quit; &lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;      sqxcrta
          sqxjm
              sqxsort
                  sqxsrc( WORK.L2 )
              sqxsort
                  sqxjm
                      sqxsort
                          sqxsrc( WORK.L1 )
                      sqxsort
                          sqxjndx
                              sqxsrc( WORK.INNER(alias = I) )
                              sqxsrc( WORK.MAIN(alias = M) )&lt;/PRE&gt;</description>
      <pubDate>Sun, 12 Apr 2020 04:33:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639258#M190127</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-04-12T04:33:54Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639268#M190133</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt; Not sure I understand the question.&lt;/P&gt;
&lt;P&gt;The index is used to retrieve date from the &lt;EM&gt;right&lt;/EM&gt; table B.&lt;/P&gt;
&lt;P&gt;Once that data is there, there's nothing else to retrieve from B.&lt;/P&gt;
&lt;P&gt;However the B data can be used to fetch the data from C, using an index again.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data CLASS(index=(NAME));
  set SASHELP.CLASS;
run;
options msglevel=i;
proc sql _method;
  select a.NAME, b.SEX, c.AGE
  from        CLASS               a
    left join CLASS(idxname=NAME) b on a.NAME=b.NAME
    left join CLASS(idxname=NAME) c on b.NAME=c.NAME;      &lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;76 options msglevel=i;&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;77 proc sql _method;&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;78 select a.NAME, b.SEX, c.AGE&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;79 from CLASS a&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;80 left join CLASS(idxname=NAME) b on a.NAME=b.NAME&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;81 left join CLASS(idxname=NAME) c on b.NAME=c.NAME;&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;INFO: Index Name of SQL table WORK.CLASS (alias = B) selected for SQL WHERE clause (outer join) optimization.&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;INFO: Index Name of SQL table WORK.CLASS (alias = C) selected for SQL WHERE clause (outer join) optimization.&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote5_1586675035190" class="sasNote" style="color: blue; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;NOTE: SQL execution methods chosen are:&lt;/DIV&gt;
&lt;DIV class="sasNote" style="color: blue; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxslct&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxjndx&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxjndx&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxsrc( WORK.CLASS(alias = A) )&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxsrc( WORK.CLASS(alias = B) )&lt;/DIV&gt;
&lt;DIV class="sasSource" style="color: #000000; font-family: Consolas, Courier, 'Courier New'; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: pre; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: #ffffff; text-decoration-style: initial; text-decoration-color: initial;"&gt;sqxsrc( WORK.CLASS(alias = C) )&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 08:03:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639268#M190133</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-12T08:03:03Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639413#M190238</link>
      <description>&lt;P&gt;Assuming there is a column "name" in dataset "ids", this does the first part of your SQL:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data final;
set have;
if _n_ = 1
then do;
  length name $20; /* set attributes of data variables in the hash */
  declare hash ids(dataset:"ids");
  ids.definekey('id');
  ids.definedata('name'); /* insert more columns as needed */
  ids.definedone();
  call missing(name); /* prevents "uninitialized" NOTE */
end;
if ids.find() = 0; /* does two things:
  - fetches data coming from dataset ids
  - proceeds with the data step iteration only if match is found
    (does the "inner join")
*/
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Expand this as needed.&lt;/P&gt;
&lt;P&gt;For more detailed assistance, show the structures of all the datasets in your SQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&lt;FONT size="2"&gt;Edited according to answer by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15043"&gt;@ScottBass&lt;/a&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 10:11:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639413#M190238</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-13T10:11:40Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639422#M190241</link>
      <description>&lt;P&gt;There is an implied retain statement for data set variables.&amp;nbsp; Therefore the "if 0 then set ids" construct will cause a retain on all variables in ids.&amp;nbsp; Of particular interest are your definedata() variables, eg. "name".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since ids is never read by a set statement, any failed find() method will cause the value from the previous successful find to be carried forward in the PDV.&amp;nbsp; Probably not what you want.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I recommend a call missing(of _all_) or at least call missing(of list,of,hash,object,satellite,variables) at the appropriate point in your data step.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 09:01:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639422#M190241</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2020-04-13T09:01:46Z</dc:date>
    </item>
    <item>
      <title>Re: Transform Proc SQL to a Hash Object</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639427#M190244</link>
      <description>&lt;P&gt;You are absolutely right. In this particular case, though, it would not be of consequence, as the subsetting if lets only observations through&amp;nbsp;where a value is found.&lt;/P&gt;
&lt;P&gt;But I will edit my post, as the other ("left join") hashes would suffer from this effect.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Apr 2020 10:08:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Transform-Proc-SQL-to-a-Hash-Object/m-p/639427#M190244</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-13T10:08:59Z</dc:date>
    </item>
  </channel>
</rss>

