<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Discarding duplicate events produced by QKB matchcode function in Streaming Analytics</title>
    <link>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/524981#M118</link>
    <description>Thank you. I'll look into the proposed approaches.</description>
    <pubDate>Mon, 07 Jan 2019 07:46:52 GMT</pubDate>
    <dc:creator>Rain</dc:creator>
    <dc:date>2019-01-07T07:46:52Z</dc:date>
    <item>
      <title>Discarding duplicate events produced by QKB matchcode function</title>
      <link>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/523019#M116</link>
      <description>&lt;P&gt;Hi.&lt;/P&gt;&lt;P&gt;I am using a compute window with data quality&amp;nbsp;QKB functions, and they return duplicate events that I need to discard in the next window.&lt;/P&gt;&lt;P&gt;Here's an example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This compute window takes company names as inputs. Each name is unique and acts as the key field:&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company&amp;nbsp;AB Industrial&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company Mechanics Ltd&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company BA Industrial&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;etc&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then&amp;nbsp;I apply this function:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;FONT size="2"&gt;&lt;STRONG&gt;&amp;lt;field-expr&amp;gt;&amp;lt;![CDATA[bf3.matchcode("Name", 95, CompanyName, result95) return result95]]&amp;gt;&amp;lt;/field-expr&amp;gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;The output is the company name and its matchcode:&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company AB Industrial,&amp;nbsp;42&amp;amp;BF&amp;amp;7R7B4#7B8$$$$$$$$$$$$&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company Mechanics Ltd,&amp;nbsp;42&amp;amp;BF&amp;amp;7Y~Y&amp;amp;87BF$$$$$$$$$$$$&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;Company BA Industrial,&amp;nbsp;42&amp;amp;BF&amp;amp;7R7B4#7B8$$$$$$$$$$$$&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Because I am using matchcode sensitivity 95, the first and third companies get the same matchcode value. My aim is to have a lookup window with unique matchcode values so I can compare these to data coming from another source window.&lt;/P&gt;&lt;P&gt;Any suggestions on how I can get rid of duplicate matchcode values?
This means that the input would be company name and matchcode, but the output would be only the unique matchcodes (which would also become the unique key field). Ideally my previous example would produce only two rows:&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;42&amp;amp;BF&amp;amp;7R7B4#7B8$$$$$$$$$$$$&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;&lt;EM&gt;42&amp;amp;BF&amp;amp;7Y~Y&amp;amp;87BF$$$$$$$$$$$$&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;I tried a union window (strict="false" output-insert-only="true"), as it should prevent duplicate outputs, but it seems to work only when the duplicates originate from different windows connected to the union window. We are using ESP 4.3.&lt;/P&gt;</description>
      <pubDate>Fri, 21 Dec 2018 07:46:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/523019#M116</guid>
      <dc:creator>Rain</dc:creator>
      <dc:date>2018-12-21T07:46:50Z</dc:date>
    </item>
    <item>
      <title>Re: Discarding duplicate events produced by QKB matchcode function</title>
      <link>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/524315#M117</link>
      <description>&lt;P&gt;Hi Rain,&lt;/P&gt;
&lt;P&gt;This is a little tricky.&amp;nbsp; How many match codes could you potentially get?&amp;nbsp; An aggregate window could group events by the matchcode.&amp;nbsp; But you would need to worry about storage growth in the aggregate window because nothing will remove the old events.&lt;/P&gt;
&lt;P&gt;You could write a Python or DS2 routine with a hash table that uses the matchcode as the hash key.&amp;nbsp; Then, you could keep a record of the matchcode values that have been processed.&amp;nbsp; An event would only be output if it carried a new matchcode.&amp;nbsp; That approach wouldn't raise the same memory concerns, because you could make the window pi_EMPTY.&lt;/P&gt;
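&lt;P&gt;A minimal sketch of the hash-table idea in plain Python, outside of ESP. The function name, the tuple layout, and the sample matchcodes are hypothetical placeholders, and wiring this into an ESP procedural window is not shown. Note that the set of seen matchcodes still grows with the number of distinct values.&lt;/P&gt;

```python
def unique_matchcodes(events):
    """Yield each matchcode the first time it appears in the event stream.

    `events` is assumed to be an iterable of (company_name, matchcode) pairs;
    this layout is an illustration, not an ESP API.
    """
    seen = set()  # hash-based record of matchcodes already emitted
    for _company_name, matchcode in events:
        if matchcode in seen:
            continue  # duplicate matchcode: discard the event
        seen.add(matchcode)
        yield matchcode


# Hypothetical sample: the first and third companies share a matchcode.
sample_events = [
    ("Company AB Industrial", "MC_A"),
    ("Company Mechanics Ltd", "MC_B"),
    ("Company BA Industrial", "MC_A"),
]

print(list(unique_matchcodes(sample_events)))
```

&lt;P&gt;With the sample above, only the two distinct matchcodes are emitted, in the order they first appear.&lt;/P&gt;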
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Andy&lt;/P&gt;</description>
      <pubDate>Thu, 03 Jan 2019 16:39:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/524315#M117</guid>
      <dc:creator>AndyT_SAS</dc:creator>
      <dc:date>2019-01-03T16:39:47Z</dc:date>
    </item>
    <item>
      <title>Re: Discarding duplicate events produced by QKB matchcode function</title>
      <link>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/524981#M118</link>
      <description>Thank you. I'll look into the proposed approaches.</description>
      <pubDate>Mon, 07 Jan 2019 07:46:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Streaming-Analytics/Discarding-duplicate-events-produced-by-QKB-matchcode-function/m-p/524981#M118</guid>
      <dc:creator>Rain</dc:creator>
      <dc:date>2019-01-07T07:46:52Z</dc:date>
    </item>
  </channel>
</rss>

