<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Refining output of a join to exclude dup value created in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411895#M279727</link>
    <description>&lt;P&gt;thanks for your suggestions.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;- "sample of output" is actual result (I added a first column to indicate the rows I ideally want to retain)&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;- Each unique id in Unique_ID_A, I only want to see once (e.g., 975059187 appears twice)&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -Additionally, there is also a unique identifier from table B (i.e., I would like to retain 536184298 linked with&amp;nbsp;&lt;SPAN&gt;975059187)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Some details on this data: table B captures anytime an agent processes a credit on an account. Could be more than 1 credit for same account same day, and could get deleted prior to getting processed.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;table A captures credits that did get processed. The objective is to identify which credits did get processed for a given agent.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Additional rules: the linking variable between both tables is the acct_id. Then the criteria is, match by date, and amount.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;- Also, each table contains a unique identifier whose sequence is linear (a greater number means the credit got processed later)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Perhaps a data step merge is best approach to do the join?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 09 Nov 2017 12:33:37 GMT</pubDate>
    <dc:creator>brulard</dc:creator>
    <dc:date>2017-11-09T12:33:37Z</dc:date>
    <item>
      <title>Refining output of a join to exclude dup value created</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411730#M279725</link>
      <description>&lt;P&gt;If someone could provide guidance on this issue. After doing an SQL inner join with two tables, there are duplicate records created.&lt;/P&gt;&lt;P&gt;Is there a join option, or string that can be added to a query to essentially dictate "move in a sequential order to match records from table 1 ONCE to matching record of table B, then move on to next row" (in other words, do not match a given value from table A more than once with value from table B). Apologies if this sounds crude.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The query I ran:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sql; create table DID as&lt;BR /&gt;select distinct&lt;/P&gt;&lt;P&gt;a.*,&lt;BR /&gt;b.TXN_DT AS TRANSACTION_DATE,b.MSTR_CRD_TXN_ID,b.POSTNG_DT,b.SRC_SYS_POSTNG_TM_CNT&lt;BR /&gt;FROM Z4 A inner join (select distinct * from TRANS) B ON&lt;BR /&gt;A.ACCT_ID=B.ACCT_ID AND B.TXN_AMT=A.AMOUNT and (A.EVENT_DT=B.POSTNG_DT or A.EVENT_DT=B.POSTNG_DT-1 or A.EVENT_DT=B.POSTNG_DT-2 or A.EVENT_DT=B.POSTNG_DT-3) ;QUIT;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/*the reason for the join criterion to various&amp;nbsp;&lt;SPAN&gt;POSTNG_DT (dates) is that there sometime is a time lapse between when records from table a get loaded in table b*/&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sample of current output:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;WANT?&lt;/TD&gt;&lt;TD&gt;UNIQUE_ID_B&lt;/TD&gt;&lt;TD&gt;UNIQUE_ID_A&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;YES&lt;/TD&gt;&lt;TD&gt;536184298&lt;/TD&gt;&lt;TD&gt;975059187&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;NO&lt;/TD&gt;&lt;TD&gt;536184298&lt;/TD&gt;&lt;TD&gt;975059188&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;NO&lt;/TD&gt;&lt;TD&gt;536184307&lt;/TD&gt;&lt;TD&gt;975059187&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;YES&lt;/TD&gt;&lt;TD&gt;536184307&lt;/TD&gt;&lt;TD&gt;975059188&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Notice that when selecting which of the two&amp;nbsp;&lt;SPAN&gt;975059187 from UNIQUE_ID_A, the row desired has a lower value in&amp;nbsp;UNIQUE_ID_B while the opposite occurs selecting which to pick for&amp;nbsp;975059188 (the value in B is higher).&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And here is my attempt using a BASES sas merge, (but a possible issue here is I am missing some records in the output):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=trans; by ACCT_ID POSTNG_DT SRC_SYS_POSTNG_TM_CNT TXN_AMT MSTR_CRD_TXN_ID;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=z4; by ACCT_ID EVENT_DT amount event_id;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data merged;&lt;BR /&gt;merge trans (in=a KEEP=MSTR_CRD_TXN_ID SRC_SYS_ACCT_ID POSTNG_DT SRC_SYS_POSTNG_TM_CNT txn_amt)&lt;BR /&gt;z4 (IN=b) ;&lt;BR /&gt;by SRC_SYS_ACCT_ID ;&lt;BR /&gt;IF (POSTNG_DT-3=EVENT_DT OR POSTNG_DT-2=EVENT_DT or POSTNG_DT-1=EVENT_DT or POSTNG_DT=EVENT_DT) AND TXN_AMT=AMOUNT;RUN;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Thu, 09 Nov 2017 11:41:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411730#M279725</guid>
      <dc:creator>brulard</dc:creator>
      <dc:date>2017-11-09T11:41:28Z</dc:date>
    </item>
    <item>
      <title>Re: Refining output of a join to exclude dups created</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411747#M279726</link>
      <description>&lt;P&gt;It isn't clear if your "sample of output" is your actual result or your desired result.&lt;/P&gt;
&lt;P&gt;If it is your actual result then I do not see any "duplicate". If you have some additional rule about relationships between Unique_Id_B and Unique_Id_A I think you need to be a bit more detailed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It might help to provide short examples of the two sets and desired result.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Nov 2017 22:59:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411747#M279726</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-11-08T22:59:33Z</dc:date>
    </item>
    <item>
      <title>Re: Refining output of a join to exclude dup value created</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411895#M279727</link>
      <description>&lt;P&gt;thanks for your suggestions.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;- "sample of output" is actual result (I added a first column to indicate the rows I ideally want to retain)&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;- Each unique id in Unique_ID_A, I only want to see once (e.g., 975059187 appears twice)&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -Additionally, there is also a unique identifier from table B (i.e., I would like to retain 536184298 linked with&amp;nbsp;&lt;SPAN&gt;975059187)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Some details on this data: table B captures anytime an agent processes a credit on an account. Could be more than 1 credit for same account same day, and could get deleted prior to getting processed.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;table A captures credits that did get processed. The objective is to identify which credits did get processed for a given agent.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;-Additional rules: the linking variable between both tables is the acct_id. Then the criteria is, match by date, and amount.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;- Also, each table contains a unique identifier whose sequence is linear (a greater number means the credit got processed later)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Perhaps a data step merge is best approach to do the join?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Nov 2017 12:33:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Refining-output-of-a-join-to-exclude-dup-value-created/m-p/411895#M279727</guid>
      <dc:creator>brulard</dc:creator>
      <dc:date>2017-11-09T12:33:37Z</dc:date>
    </item>
  </channel>
</rss>

