<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: De-duplication in Data management Studio in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/De-duplication-in-Data-management-Studio/m-p/829174#M20522</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is no definition in the QKB for matching and clustering records by age.&lt;/P&gt;
&lt;P&gt;Also, I thought about your use case and I don't think it is a good solution. Let me explain what could happen:&lt;/P&gt;
&lt;P&gt;-&amp;gt; customer A is 20 and matches customer B who is 25,&lt;/P&gt;
&lt;P&gt;-&amp;gt; but customer C is 30, and therefore matches customer B,&lt;/P&gt;
&lt;P&gt;-&amp;gt; and, customer D is 35 and therefore matches customer C.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the end, A is linked to D through B and C, even though they are 15 years apart, and you'll end up with one big cluster containing all of your records.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So I think age should not be used this way.&amp;nbsp;Maybe there are other criteria in your data that would be better.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;
&lt;P&gt;Audrey&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Aug 2022 11:57:04 GMT</pubDate>
    <dc:creator>audrey</dc:creator>
    <dc:date>2022-08-18T11:57:04Z</dc:date>
    <item>
      <title>De-duplication in Data management Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/De-duplication-in-Data-management-Studio/m-p/829172#M20521</link>
      <description>&lt;P&gt;Dear Team,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am facing a challenge: I have to find duplicate records based on the &lt;STRONG&gt;Age&lt;/STRONG&gt; variable. The Age values should fall within a particular range.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sample:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Suppose a beneficiary's age is 29 and he or she tries to apply again with a different age, such as 25 or 34. I want to de-duplicate the data in Data Management Studio by treating ages from 5 below to 5 above as a match.&lt;/P&gt;
&lt;P&gt;My questions are:&lt;/P&gt;
&lt;P&gt;1. How do I declare de-duplication based on Age?&lt;/P&gt;
&lt;P&gt;2. Is it possible to use Age in match codes? If yes, which definition and sensitivity are suitable?&lt;/P&gt;
&lt;P&gt;3. How do I define an Age range of +5 and -5?&lt;/P&gt;</description>
      <pubDate>Thu, 18 Aug 2022 11:31:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/De-duplication-in-Data-management-Studio/m-p/829172#M20521</guid>
      <dc:creator>Shakti_Sourav</dc:creator>
      <dc:date>2022-08-18T11:31:08Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplication in Data management Studio</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/De-duplication-in-Data-management-Studio/m-p/829174#M20522</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is no definition in the QKB for matching and clustering records by age.&lt;/P&gt;
&lt;P&gt;Also, I thought about your use case and I don't think it is a good solution. Let me explain what could happen:&lt;/P&gt;
&lt;P&gt;-&amp;gt; customer A is 20 and matches customer B who is 25,&lt;/P&gt;
&lt;P&gt;-&amp;gt; but customer C is 30, and therefore matches customer B,&lt;/P&gt;
&lt;P&gt;-&amp;gt; and, customer D is 35 and therefore matches customer C.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the end, A is linked to D through B and C, even though they are 15 years apart, and you'll end up with one big cluster containing all of your records.&lt;/P&gt;
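&lt;P&gt;To make the chaining effect concrete, here is a minimal sketch in plain Python. It is not Data Management Studio or QKB code. The names is_match, cluster and TOLERANCE are hypothetical, and it assumes that clustering groups records by following every pairwise match transitively, which is what the A, B, C, D example describes.&lt;/P&gt;
&lt;PRE&gt;
```python
# Hypothetical sample: the four customers from the example above.
ages = {"A": 20, "B": 25, "C": 30, "D": 35}
TOLERANCE = 5  # assumed rule: ages at most 5 years apart are a match


def is_match(x, y):
    """Pairwise rule: two integer ages match when they differ by at most TOLERANCE."""
    return abs(x - y) in range(TOLERANCE + 1)


def cluster(records):
    """Group records by following pairwise matches transitively (union-find)."""
    parent = {name: name for name in records}

    def find(name):
        # Walk up to the representative of the cluster this record belongs to.
        while parent[name] != name:
            name = parent[name]
        return name

    names = sorted(records)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            if is_match(records[left], records[right]):
                # Merge the two clusters as soon as any pair matches.
                parent[find(right)] = find(left)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


clusters = cluster(ages)
direct = is_match(ages["A"], ages["D"])
print(clusters)  # [['A', 'B', 'C', 'D']]
print(direct)    # False
```
&lt;/PRE&gt;
&lt;P&gt;A and D do not match each other directly, yet they land in the same cluster because each neighbouring pair is within the tolerance.&lt;/P&gt;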
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So I think age should not be used this way.&amp;nbsp;Maybe there are other criteria in your data that would be better.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;
&lt;P&gt;Audrey&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Aug 2022 11:57:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/De-duplication-in-Data-management-Studio/m-p/829174#M20522</guid>
      <dc:creator>audrey</dc:creator>
      <dc:date>2022-08-18T11:57:04Z</dc:date>
    </item>
  </channel>
</rss>

