<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC SQL JOINS PERFORMANCE in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22273#M4860</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;i am having 100 records and duplicates are possible in my test data set.Can you refer to me a best strategy&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 05 Mar 2012 20:54:41 GMT</pubDate>
    <dc:creator>JasonNC</dc:creator>
    <dc:date>2012-03-05T20:54:41Z</dc:date>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22263#M4850</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Does using left joins on Non-Key fields when we have huge data creates a performance issue.If it creates then what is the best way to over come that.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;test is my data set in SAS and the remaining tables are from data base.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;ID AND ID1 are two different Id's and so i called the data base table again in the same query. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For example&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Proc Sql; &lt;/P&gt;&lt;P&gt;&amp;nbsp; create table test as&lt;/P&gt;&lt;P&gt;select t1.*,t2.Name,t3.Address&lt;/P&gt;&lt;P&gt;From test AS t1 LEFT JOIN DB2_CON.. Name_table AS t2 ON (t1.Id=t2.Id)&lt;/P&gt;&lt;P&gt;LEFT JOIN DB2_CON.. Address_table AS t3 ON (t2.state_id=t3.State_id)&lt;/P&gt;&lt;P&gt;LEFT JOIN DB2_CON.. Name_table AS t4 ON (t1.id1=t2.Id1)&lt;/P&gt;&lt;P&gt;LEFT JOIN DB2_CON..Address_table AS t5 ON (t2.state_id=t5.State_id);&lt;/P&gt;&lt;P&gt;QUIT;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt; I hope it is clear and i appreciate your help.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 15:17:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22263#M4850</guid>
      <dc:creator>JasonNC</dc:creator>
      <dc:date>2012-03-05T15:17:51Z</dc:date>
    </item>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22264#M4851</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt; Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Do you have indexes created on the keys you are comparing against?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That should help, especially if the # of rows in each table is significantly different.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;--Ben&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 16:15:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22264#M4851</guid>
      <dc:creator>BenConner</dc:creator>
      <dc:date>2012-03-05T16:15:13Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22265#M4852</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I don't understand why you wouldn't use :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Proc Sql; &lt;BR /&gt;create table test as&lt;BR /&gt;select t1.*, t2.Name, t3.Address&lt;BR /&gt;From test AS t1 &lt;BR /&gt;LEFT JOIN &lt;STRONG style="color: #ff0000;"&gt;DB2_CON.. &lt;/STRONG&gt;Name_table AS t2 ON ((t1.Id=t2.Id) or (t1.id1=t2.id1))&lt;BR /&gt;LEFT JOIN &lt;STRONG style="color: #ff0000;"&gt;DB2_CON.. &lt;/STRONG&gt;Address_table AS t3 ON (t2.state_id=t3.State_id);&lt;BR /&gt;QUIT;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;and cut the number of joins by half. I don't understand the syntax in red either...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 16:17:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22265#M4852</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2012-03-05T16:17:33Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22266#M4853</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I was having the same issue with performance. It takes hours and hours for my code to run. When I started creating hash tables in SAS, the performance has been increased. Normally my left joins takes 3 - 4 hours to run that has about 50 millions of data but now with hash tables it takes about half hour to run.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Hash tables are best way to perform joins and I love it. It has some delimitations as well. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://www2.sas.com/proceedings/forum2008/029-2008.pdf"&gt;http://www2.sas.com/proceedings/forum2008/029-2008.pdf&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 16:55:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22266#M4853</guid>
      <dc:creator>Hima</dc:creator>
      <dc:date>2012-03-05T16:55:39Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22267#M4854</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The things in red are the connectors to the Data base.As far as the jons i can't use OR as there are different id's and they both will be there .I forgot to keep Name1 and Name 2. Address 1 and Address 2.It's my bad&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;Proc Sql;&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;&amp;nbsp; create table test as&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;select t1.*,t2.Name1 t2.Name2,t3.Address1 and t3.Address2&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;From test AS t1 LEFT JOIN DB2_CON.. Name_table AS t2 ON (t1.Id=t2.Id)&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;LEFT JOIN DB2_CON.. Address_table AS t3 ON (t2.state_id=t3.State_id)&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;LEFT JOIN DB2_CON.. Name_table AS t4 ON (t1.id1=t2.Id1)&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;LEFT JOIN DB2_CON..Address_table AS t5 ON (t2.state_id=t5.State_id);&lt;/P&gt;&lt;P style="background-color: #eef4f9;"&gt;QUIT;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 17:18:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22267#M4854</guid>
      <dc:creator>JasonNC</dc:creator>
      <dc:date>2012-03-05T17:18:37Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22268#M4855</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Hima,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thx fo ryou reply.I have one questions do&amp;nbsp; you use hashing in every scenario or when one data set is small and the other one is large.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 17:21:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22268#M4855</guid>
      <dc:creator>JasonNC</dc:creator>
      <dc:date>2012-03-05T17:21:56Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22269#M4856</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have created hash tables in every scenario and it worked fine for me. The results matched with the proc sql left join results. Over all the run time has reduced.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 17:30:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22269#M4856</guid>
      <dc:creator>Hima</dc:creator>
      <dc:date>2012-03-05T17:30:40Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22270#M4857</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt; One option to consider is pushing test onto the server in a temporary table as explained in :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002677163.htm"&gt;http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a002677163.htm&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;and have the DB2 server do the joins.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 17:32:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22270#M4857</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2012-03-05T17:32:05Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22271#M4858</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The reason for the slow performance is SAS has to read all of your selected data out of DB2 into the SAS environment BEFORE it can compare it with your TEST dataset. In extreme examples you could be reading millions of records out of DB2 to join to a handful of rows in SAS.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;There are a number of strategies for dealing with this that have been covered in previous posts. PGStat's method is one of these. How many distinct IDs are in your TEST dataset?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 19:16:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22271#M4858</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2012-03-05T19:16:51Z</dc:date>
    </item>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22272#M4859</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt; use proc sql pass through&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 19:40:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22272#M4859</guid>
      <dc:creator>sassharp</dc:creator>
      <dc:date>2012-03-05T19:40:54Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22273#M4860</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;i am having 100 records and duplicates are possible in my test data set.Can you refer to me a best strategy&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 20:54:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22273#M4860</guid>
      <dc:creator>JasonNC</dc:creator>
      <dc:date>2012-03-05T20:54:41Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22274#M4861</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Only 100 records, well then, how about never joining test with the DB2 tables? Very low tech :&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Proc Sql;&lt;/P&gt;&lt;P&gt;select id into :id seperated by ","&lt;BR /&gt;from test where id is not missing;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select id1 into :id1 seperated by ","&lt;BR /&gt;from test where id1 is not missing;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;create table ids as&lt;BR /&gt;select T2.*, T3.Address1 from DB2_CON.Name_table AS t2 &lt;BR /&gt;LEFT JOIN DB2_CON.Address_table AS t3 ON (t2.state_id=t3.State_id)&lt;BR /&gt;where T2.id in (&amp;amp;id.);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;create table id1s ...&lt;/P&gt;&lt;P&gt;you get the idea. And then join local tables test, ids and id1s...&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;QUIT;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 21:30:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22274#M4861</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2012-03-05T21:30:49Z</dc:date>
    </item>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22275#M4862</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt; try to use dbkey option in your code. it definitely imroves performance when you are joining sas data set and dbms table.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 21:36:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22275#M4862</guid>
      <dc:creator>sassharp</dc:creator>
      <dc:date>2012-03-05T21:36:33Z</dc:date>
    </item>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22276#M4863</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Using DBkey might create incorrect results sometimes. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://support.sas.com/kb/42/979.html"&gt;http://support.sas.com/kb/42/979.html&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 21:45:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22276#M4863</guid>
      <dc:creator>Hima</dc:creator>
      <dc:date>2012-03-05T21:45:07Z</dc:date>
    </item>
    <item>
      <title>PROC SQL JOINS PERFORMANCE</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22277#M4864</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;With only 100 records/values PGSTAT's approach is a popular strategy - building a text string of the ID values to be selected. If ID is a character column then you may have to wrap and values in quotes. It should give you much improved performance as all of the data selection happens in DB2 before SAS reads it.&amp;nbsp; &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 05 Mar 2012 23:52:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-JOINS-PERFORMANCE/m-p/22277#M4864</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2012-03-05T23:52:29Z</dc:date>
    </item>
  </channel>
</rss>

