<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS Table Creation with FULL OUTER JOIN takes inordinate time in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645751#M193062</link>
    <description>&lt;P&gt;A few further comments:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Try to split the process happening in oracle and the process happening in SAS.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bringing whole tables to SAS is inefficient.&lt;/P&gt;
&lt;P&gt;For example&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;FROM
BASE_FOODBORNE GEN_LDF
INNER JOIN &lt;FONT color="#0000FF"&gt;NBS_RDB.INVESTIGATION INV
ON
GEN_LDF.INVESTIGATION_UID=INV.CASE_UID
INNER JOIN NBS_RDB.GENERIC_CASE GEN
ON
GEN.INVESTIGATION_KEY=INV.INVESTIGATION_KEY
INNER JOIN NBS_RDB.CONDITION
ON
CONDITION.CONDITION_KEY= GEN.CONDITION_KEY
INNER JOIN NBS_RDB.D_PATIENT PATIENT
ON
PATIENT.PATIENT_KEY=GEN.PATIENT_KEY&lt;/FONT&gt;&lt;/PRE&gt;
&lt;P&gt;All the joins in blue are oracle tables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Make the join happen there, or example by coding something like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;FROM
BASE_FOODBORNE GEN_LDF
INNER JOIN &lt;FONT color="#0000FF"&gt;&lt;STRONG&gt;&lt;U&gt;(select ???  from&lt;/U&gt;&lt;/STRONG&gt; NBS_RDB.INVESTIGATION INV
INNER JOIN NBS_RDB.GENERIC_CASE GEN
ON
GEN.INVESTIGATION_KEY=INV.INVESTIGATION_KEY
INNER JOIN NBS_RDB.CONDITION
ON
CONDITION.CONDITION_KEY= GEN.CONDITION_KEY
INNER JOIN NBS_RDB.D_PATIENT PATIENT
ON
PATIENT.PATIENT_KEY=GEN.PATIENT_KEY &lt;STRONG&gt; &lt;U&gt;)  k&lt;/U&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Use option &lt;STRONG&gt;sastrace&lt;/STRONG&gt; to verify what queries are sent to Oracle.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. In my experience a clause such as&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;FONT face="courier new,courier"&gt;where VAL in (select ..from... )&lt;/FONT&gt;&amp;nbsp; &amp;nbsp; is less efficient then a join. Use (typically inner) joins.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. How many values are returned by&amp;nbsp; &amp;nbsp; &lt;SPAN&gt;SELECT CONDITION_CD FROM&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE'&amp;nbsp; &amp;nbsp;?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;It might be better to create a static string using macro variables.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;5. You should also add a DISTINCT in the query above if applicable&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6. Do us and yourself a favour: format your code. It is illegible. For example&lt;/P&gt;
&lt;PRE&gt;CREATE TABLE ALL_FOODBORNE AS
SELECT A.*,
B.DATAMART_COLUMN_NM AS DM 'DM'
FROM NBS_RDB.LDF_DATAMART_COLUMN_REF B
FULL OUTER JOIN LINKED_FOODBORNE A
ON A.LDF_UID= B.LDF_UID WHERE
(B.LDF_PAGE_SET ='OTHER'
OR B.CONDITION_CD IN (SELECT CONDITION_CD FROM
NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE')&lt;/PRE&gt;
&lt;P&gt;could be something like&lt;/P&gt;
&lt;PRE&gt;create table ALL_FOODBORNE as
select A.* 
      ,B.DATAMART_COLUMN_NM as DM 'DM'
from 
  NBS_RDB.LDF_DATAMART_COLUMN_REF  B
    full outer join 
  LINKED_FOODBORNE                 A
    on A.LDF_UID= B.LDF_UID
where
  (B.LDF_PAGE_SET ='OTHER'
  or B.CONDITION_CD in (select unique CONDITION_CD 
                        from NBS_RDB.LDF_DATAMART_TABLE_REF
                        where DATAMART_NAME = 'LDF_FOODBORNE')
  )&lt;/PRE&gt;
&lt;P&gt;A couple more lines, but 1000% easier to understand and debug.&lt;/P&gt;
&lt;P&gt;Note that in this example too, the 2 oracle tables should be queried together before being joined to the SAS table,&lt;/P&gt;
&lt;P&gt;which is perfect since the where clause is on the Oracle data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 07 May 2020 02:43:20 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2020-05-07T02:43:20Z</dc:date>
    <item>
      <title>SAS Table Creation with FULL OUTER JOIN takes inordinate time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645496#M192986</link>
      <description>&lt;P&gt;Hello&amp;nbsp; Experts,&lt;/P&gt;&lt;P&gt;I am new to SAS programming and I am supporting a PROD ETL process written in SAS. It connects to Oracle DB and the processing is between Oracle and SAS. These processes run every day.&lt;/P&gt;&lt;P&gt;These ETL processes run for more than 17hrs. Looking to tune these processes, we noticed that&amp;nbsp;&lt;/P&gt;&lt;P&gt;at one particular point the script seems to hang for more than 8 hrs. On a good day this step completes in couple of mins.&lt;/P&gt;&lt;P&gt;The ETL processes are inter-related. Each step depends on its previous step.&lt;/P&gt;&lt;P&gt;Attached is a pic of the WORK Library where I see the server is chugging to create big TMP utility files.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please help me understand the bottleneck here.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is a snippet that just hangs:&amp;nbsp;&lt;/P&gt;&lt;P&gt;NDS_RDB is our oracle database. RDB is the SAS library.&amp;nbsp;LDF_DIMENSIONAL_DATA is an object created in the previous step.&lt;/P&gt;&lt;P&gt;It creates&amp;nbsp;BASE_FOODBORNE,&amp;nbsp;LINKED_FOODBORNE and ALL_FOODBORNE objects in the WORK library.&lt;/P&gt;&lt;P&gt;In the WORK library I see BASE_FOODBORNE and LINKED_FOODBORNE created.&amp;nbsp;&lt;/P&gt;&lt;P&gt;ALL_FOODBORNE is not seen and I see temp utility files growing big.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;BR /&gt;DROP TABLE NBS_RDB.LDF_FOODBORNE;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;CREATE TABLE BASE_FOODBORNE AS&lt;BR /&gt;SELECT * FROM LDF_DIMENSIONAL_DATA WHERE PHC_CD IN (SELECT CONDITION_CD FROM&lt;BR /&gt;NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;CREATE TABLE LINKED_FOODBORNE AS&lt;BR /&gt;SELECT GEN_LDF.*,&lt;BR /&gt;INV.INVESTIGATION_KEY,&lt;BR /&gt;INV.INV_LOCAL_ID AS INVESTIGATION_LOCAL_ID 'INVESTIGATION_LOCAL_ID',&lt;BR /&gt;INV.CASE_OID AS PROGRAM_JURISDICTION_OID 'PROGRAM_JURISDICTION_OID',&lt;BR /&gt;GEN.PATIENT_KEY,&lt;BR /&gt;PATIENT.PATIENT_LOCAL_ID AS PATIENT_LOCAL_ID 'PATIENT_LOCAL_ID',&lt;BR /&gt;CONDITION.CONDITION_SHORT_NM AS DISEASE_NAME 'DISEASE_NAME',&lt;BR /&gt;CONDITION.CONDITION_CD,&lt;BR /&gt;GEN_LDF.PHC_CD&lt;BR /&gt;FROM&lt;BR /&gt;BASE_FOODBORNE GEN_LDF&lt;BR /&gt;INNER JOIN NBS_RDB.INVESTIGATION INV&lt;BR /&gt;ON&lt;BR /&gt;GEN_LDF.INVESTIGATION_UID=INV.CASE_UID&lt;BR /&gt;INNER JOIN NBS_RDB.GENERIC_CASE GEN&lt;BR /&gt;ON&lt;BR /&gt;GEN.INVESTIGATION_KEY=INV.INVESTIGATION_KEY&lt;BR /&gt;INNER JOIN NBS_RDB.CONDITION&lt;BR /&gt;ON&lt;BR /&gt;CONDITION.CONDITION_KEY= GEN.CONDITION_KEY&lt;BR /&gt;INNER JOIN NBS_RDB.D_PATIENT PATIENT&lt;BR /&gt;ON&lt;BR /&gt;PATIENT.PATIENT_KEY=GEN.PATIENT_KEY&lt;BR /&gt;ORDER BY&lt;BR /&gt;INVESTIGATION_UID;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;CREATE TABLE ALL_FOODBORNE AS&lt;BR /&gt;SELECT A.*,&lt;BR /&gt;B.DATAMART_COLUMN_NM AS DM 'DM'&lt;BR /&gt;FROM NBS_RDB.LDF_DATAMART_COLUMN_REF B&lt;BR /&gt;FULL OUTER JOIN LINKED_FOODBORNE A&lt;BR /&gt;ON A.LDF_UID= B.LDF_UID WHERE&lt;BR /&gt;(B.LDF_PAGE_SET ='OTHER'&lt;BR /&gt;OR B.CONDITION_CD IN (SELECT CONDITION_CD FROM&lt;BR /&gt;NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE')&lt;BR /&gt;)&lt;BR /&gt;ORDER BY&lt;BR /&gt;INVESTIGATION_UID;&lt;BR /&gt;QUIT;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 06 May 2020 01:36:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645496#M192986</guid>
      <dc:creator>MadhuriDara</dc:creator>
      <dc:date>2020-05-06T01:36:37Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Table Creation with FULL OUTER JOIN takes inordinate time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645655#M193038</link>
      <description>&lt;P&gt;I don't know how big are your tables but in your code there are many times approach to oracle database table which uses a lot of resources.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;you can save a lot of time if you work in a sas as much as possible.&lt;/P&gt;
&lt;P&gt;I have attached a ms-word document with your code with &lt;STRONG&gt;NDS_RDS.&lt;/STRONG&gt; marked in yellow.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I suggest to load the oracle table into sas and eliminate all those references to oracle.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 06 May 2020 16:26:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645655#M193038</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2020-05-06T16:26:28Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Table Creation with FULL OUTER JOIN takes inordinate time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645751#M193062</link>
      <description>&lt;P&gt;A few further comments:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Try to split the process happening in oracle and the process happening in SAS.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bringing whole tables to SAS is inefficient.&lt;/P&gt;
&lt;P&gt;For example&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;FROM
BASE_FOODBORNE GEN_LDF
INNER JOIN &lt;FONT color="#0000FF"&gt;NBS_RDB.INVESTIGATION INV
ON
GEN_LDF.INVESTIGATION_UID=INV.CASE_UID
INNER JOIN NBS_RDB.GENERIC_CASE GEN
ON
GEN.INVESTIGATION_KEY=INV.INVESTIGATION_KEY
INNER JOIN NBS_RDB.CONDITION
ON
CONDITION.CONDITION_KEY= GEN.CONDITION_KEY
INNER JOIN NBS_RDB.D_PATIENT PATIENT
ON
PATIENT.PATIENT_KEY=GEN.PATIENT_KEY&lt;/FONT&gt;&lt;/PRE&gt;
&lt;P&gt;All the joins in blue are oracle tables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Make the join happen there, or example by coding something like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;FROM
BASE_FOODBORNE GEN_LDF
INNER JOIN &lt;FONT color="#0000FF"&gt;&lt;STRONG&gt;&lt;U&gt;(select ???  from&lt;/U&gt;&lt;/STRONG&gt; NBS_RDB.INVESTIGATION INV
INNER JOIN NBS_RDB.GENERIC_CASE GEN
ON
GEN.INVESTIGATION_KEY=INV.INVESTIGATION_KEY
INNER JOIN NBS_RDB.CONDITION
ON
CONDITION.CONDITION_KEY= GEN.CONDITION_KEY
INNER JOIN NBS_RDB.D_PATIENT PATIENT
ON
PATIENT.PATIENT_KEY=GEN.PATIENT_KEY &lt;STRONG&gt; &lt;U&gt;)  k&lt;/U&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Use option &lt;STRONG&gt;sastrace&lt;/STRONG&gt; to verify what queries are sent to Oracle.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. In my experience a clause such as&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;FONT face="courier new,courier"&gt;where VAL in (select ..from... )&lt;/FONT&gt;&amp;nbsp; &amp;nbsp; is less efficient then a join. Use (typically inner) joins.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;4. How many values are returned by&amp;nbsp; &amp;nbsp; &lt;SPAN&gt;SELECT CONDITION_CD FROM&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE'&amp;nbsp; &amp;nbsp;?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;It might be better to create a static string using macro variables.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;5. You should also add a DISTINCT in the query above if applicable&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;6. Do us and yourself a favour: format your code. It is illegible. For example&lt;/P&gt;
&lt;PRE&gt;CREATE TABLE ALL_FOODBORNE AS
SELECT A.*,
B.DATAMART_COLUMN_NM AS DM 'DM'
FROM NBS_RDB.LDF_DATAMART_COLUMN_REF B
FULL OUTER JOIN LINKED_FOODBORNE A
ON A.LDF_UID= B.LDF_UID WHERE
(B.LDF_PAGE_SET ='OTHER'
OR B.CONDITION_CD IN (SELECT CONDITION_CD FROM
NBS_RDB.LDF_DATAMART_TABLE_REF WHERE DATAMART_NAME = 'LDF_FOODBORNE')&lt;/PRE&gt;
&lt;P&gt;could be something like&lt;/P&gt;
&lt;PRE&gt;create table ALL_FOODBORNE as
select A.* 
      ,B.DATAMART_COLUMN_NM as DM 'DM'
from 
  NBS_RDB.LDF_DATAMART_COLUMN_REF  B
    full outer join 
  LINKED_FOODBORNE                 A
    on A.LDF_UID= B.LDF_UID
where
  (B.LDF_PAGE_SET ='OTHER'
  or B.CONDITION_CD in (select unique CONDITION_CD 
                        from NBS_RDB.LDF_DATAMART_TABLE_REF
                        where DATAMART_NAME = 'LDF_FOODBORNE')
  )&lt;/PRE&gt;
&lt;P&gt;A couple more lines, but 1000% easier to understand and debug.&lt;/P&gt;
&lt;P&gt;Note that in this example too, the 2 oracle tables should be queried together before being joined to the SAS table,&lt;/P&gt;
&lt;P&gt;which is perfect since the where clause is on the Oracle data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 07 May 2020 02:43:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SAS-Table-Creation-with-FULL-OUTER-JOIN-takes-inordinate-time/m-p/645751#M193062</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-07T02:43:20Z</dc:date>
    </item>
  </channel>
</rss>

