<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SQL QUERY DUPLICATES from Join Error? in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/SQL-QUERY-DUPLICATES-from-Join-Error/m-p/863080#M38126</link>
    <description>&lt;P&gt;Not an error in way shape or form, it is a designed and useful feature of SQL. Any of the JOINs with duplicate values of ON variables will by default create a combination of the two records.&lt;/P&gt;
&lt;P&gt;A small example that you can run to see what what happens IF you are using SAS. The small number of records, variables and unique values below make it pretty easy to see that where X is duplicated in both set One and Two that the Join matches all the records from One to all of the matches in Two. This behavior is pretty much a standard in all of the SQL variants (and you aren't using SAS Proc SQL but possibly Oracle?)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data one;
   input x y  ;
datalines;
1  1
1  2
1  3
2  4
;

data two;
  input x z ;
datalines;
1  100
1  200
2  6
;

proc sql;
   create table example as
   select a.x, a.y, b.z
   from one as a
        left join
        two as b
        on a.x=b.x
   ;
quit;&lt;/PRE&gt;
&lt;P&gt;Solution: If your results are truly actual duplicates, as in every single resulting variable you could try adding the DISTINCT predicate as the start of the Select. Caution: there can be a lot of computational overhead with this instruction. That query already has a number of potential execution time flags with multiple subqueries in different places.&lt;/P&gt;
&lt;P&gt;Other wise you need to reduce the duplicates earlier which could mean breaking that query up into smaller pieces.&lt;/P&gt;</description>
    <pubDate>Wed, 08 Mar 2023 22:04:34 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2023-03-08T22:04:34Z</dc:date>
    <item>
      <title>SQL QUERY DUPLICATES from Join Error?</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SQL-QUERY-DUPLICATES-from-Join-Error/m-p/863066#M38125</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please help to clarify what record / join is causing the thousands of duplicates. Kindly note, I have no formal training in SQL and appreciate simplified explanation or directions of how to correct. Thank you in advance!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Goal is to create a date or date range query which identifies Sales Department Closed/Voided Orders associated with Finance Department Open Items (item represents money)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Below is the SQL statement&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="0" cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;DIV&gt;Query SQL&lt;/DIV&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&amp;nbsp;&lt;/TD&gt;&lt;TD&gt;&lt;DIV&gt;&lt;SPAN class=""&gt;SELECT TO_CHAR(A.BOOKING_DT_FZ,'YYYY-MM-DD'), A.PREPARED_BY_FZ, C.ORDERED_FROM, A.CUST_ID, A.ORDER_REF_FZ, B.EQUIP_FZ, A.DEPOT_CD_FZ, B.BOOKING_QTY_FZ, B.PICKUP_QTY_FZ, ( B.BOOKING_QTY_FZ)-( B.PICKUP_QTY_FZ), TO_CHAR(CAST((A.FIRST_SEND_DT_FZ) AS TIMESTAMP),'YYYY-MM-DD-HH24.MI.SS.FF'), TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') - TO_DATE( TO_CHAR(A.BOOKING_DT_FZ,'YYYY-MM-DD'),'YYYY-MM-DD'), C.BOOKING_PRICE, C.TAX_AMT, C.DISCOUNT_AMT, C.DISCOUNTED_PRICE, C.DISCOUNT_TYPE_FZ, C.USED_REWARD_POINT, C.RESALE_COUPON_ID, C.PAYMENT_METHOD, A.ORDER_STATUS_FZ, D.ITEM, D.ITEM_STATUS, D.BAL_AMT&lt;BR /&gt;&amp;nbsp; FROM (((PS_FZ_BOOKING A INNER JOIN PS_FZ_SP_BU_OPSCLS A1 ON (A.BUSINESS_UNIT = A1.BUSINESS_UNIT AND&amp;nbsp; A1.OPRCLASS = 'HKBUS' )) LEFT OUTER JOIN&amp;nbsp; PS_FZ_W_RS_ORDER C ON&amp;nbsp; A.ORDER_REF_FZ = C.ORDER_REF_FZ ) LEFT OUTER JOIN&amp;nbsp; PS_ITEM_DSP_VW D ON&amp;nbsp; D.CUST_ID = C.CUST_ID ), PS_FZ_BOOK_EQUIP B, PS_FZ_SP_BU_OPSCLS B1&lt;BR /&gt;&amp;nbsp; WHERE ( B.BUSINESS_UNIT = B1.BUSINESS_UNIT&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND B1.OPRCLASS = 'HKBUS'&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND ( A.MOVE_TYPE_FZ = 'RSL'&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.ORDER_STATUS_FZ IN ('C','O','V')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.BUSINESS_UNIT = B.BUSINESS_UNIT&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.PREPARED_BY_FZ IN ('LHALL','PAGRILLO','FGABRIELE','CRISTIAN','VINCCODI','RYANCHOI','ATAMBINI')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.CUST_ID = :3&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.ORDER_REF_FZ = B.ORDER_REF_FZ&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND B.BOOKING_QTY_FZ &amp;lt;&amp;gt; B.PICKUP_QTY_FZ&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND A.BOOKING_DT_FZ BETWEEN TO_DATE(:1,'YYYY-MM-DD') AND TO_DATE(:2,'YYYY-MM-DD')&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D.EFFDT =&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (SELECT MAX(D_ED.EFFDT) FROM PS_ITEM_DSP_VW D_ED&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; WHERE D.SETID = D_ED.SETID&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D.BUSINESS_UNIT = D_ED.BUSINESS_UNIT&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D.CUST_ID = D_ED.CUST_ID&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D.ITEM = D_ED.ITEM&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D.ITEM_LINE = D_ED.ITEM_LINE&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; AND D_ED.EFFDT &amp;lt;= SYSDATE) ))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="amartin1_0-1678304444941.png" style="width: 758px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/81233i8589CB22DA1EACA8/image-dimensions/758x91?v=v2" width="758" height="91" role="button" title="amartin1_0-1678304444941.png" alt="amartin1_0-1678304444941.png" /&gt;&lt;/span&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/DIV&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2023 20:00:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SQL-QUERY-DUPLICATES-from-Join-Error/m-p/863066#M38125</guid>
      <dc:creator>amartin1</dc:creator>
      <dc:date>2023-03-08T20:00:38Z</dc:date>
    </item>
    <item>
      <title>Re: SQL QUERY DUPLICATES from Join Error?</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SQL-QUERY-DUPLICATES-from-Join-Error/m-p/863080#M38126</link>
      <description>&lt;P&gt;Not an error in way shape or form, it is a designed and useful feature of SQL. Any of the JOINs with duplicate values of ON variables will by default create a combination of the two records.&lt;/P&gt;
&lt;P&gt;A small example that you can run to see what what happens IF you are using SAS. The small number of records, variables and unique values below make it pretty easy to see that where X is duplicated in both set One and Two that the Join matches all the records from One to all of the matches in Two. This behavior is pretty much a standard in all of the SQL variants (and you aren't using SAS Proc SQL but possibly Oracle?)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data one;
   input x y  ;
datalines;
1  1
1  2
1  3
2  4
;

data two;
  input x z ;
datalines;
1  100
1  200
2  6
;

proc sql;
   create table example as
   select a.x, a.y, b.z
   from one as a
        left join
        two as b
        on a.x=b.x
   ;
quit;&lt;/PRE&gt;
&lt;P&gt;Solution: If your results are truly actual duplicates, as in every single resulting variable you could try adding the DISTINCT predicate as the start of the Select. Caution: there can be a lot of computational overhead with this instruction. That query already has a number of potential execution time flags with multiple subqueries in different places.&lt;/P&gt;
&lt;P&gt;Other wise you need to reduce the duplicates earlier which could mean breaking that query up into smaller pieces.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Mar 2023 22:04:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SQL-QUERY-DUPLICATES-from-Join-Error/m-p/863080#M38126</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-03-08T22:04:34Z</dc:date>
    </item>
  </channel>
</rss>

