<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplicate check across columns in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361436#M64483</link>
    <description>Sure here you go:&lt;BR /&gt;&lt;BR /&gt;ID1 ID2 ID3 Gender&lt;BR /&gt;123 200 300 F&lt;BR /&gt;456 400 789 F&lt;BR /&gt;789 123 800 F&lt;BR /&gt;100 600 900 M&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Thu, 25 May 2017 00:11:28 GMT</pubDate>
    <dc:creator>TechRA</dc:creator>
    <dc:date>2017-05-25T00:11:28Z</dc:date>
    <item>
      <title>Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361419#M64481</link>
      <description>I have a dataset that has multiple identifiers, and I want to check for duplicates across each of them. For example, each row has ID1, ID2, and ID3.&lt;BR /&gt;&lt;BR /&gt;I want to know if ID1 from one row matches to ID2 or ID3 in another row (not within the same row). I can't think of an easy way to do this without restructuring the dataset. I'd prefer to do this check in SQL but can't figure out how. But I'd be satisfied with being able to do this check without restructuring. All I would want to do is identify the ID1 value that matches to ID2 or ID3 in another row.&lt;BR /&gt;&lt;BR /&gt;Suggestions appreciated.</description>
      <pubDate>Wed, 24 May 2017 22:43:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361419#M64481</guid>
      <dc:creator>TechRA</dc:creator>
      <dc:date>2017-05-24T22:43:17Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361435#M64482</link>
      <description>&lt;P&gt;Can you post some sample data please.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 25 May 2017 00:00:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361435#M64482</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-25T00:00:36Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361436#M64483</link>
      <description>Sure here you go:&lt;BR /&gt;&lt;BR /&gt;ID1 ID2 ID3 Gender&lt;BR /&gt;123 200 300 F&lt;BR /&gt;456 400 789 F&lt;BR /&gt;789 123 800 F&lt;BR /&gt;100 600 900 M&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 25 May 2017 00:11:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361436#M64483</guid>
      <dc:creator>TechRA</dc:creator>
      <dc:date>2017-05-25T00:11:28Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361465#M64486</link>
      <description>&lt;P&gt;You may have to fix my syntax here, but this approach should work:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;/P&gt;
&lt;P&gt;create table matches as select a.*, b.id2 as matching_id2, b.id3 as matching_id3 from&amp;nbsp;&lt;/P&gt;
&lt;P&gt;have a, have b&lt;/P&gt;
&lt;P&gt;where a.id1 ne&amp;nbsp;b.id1 and (a.id1=b.id2 or a.id1=b.id3);&lt;/P&gt;
&lt;P&gt;quit;&lt;/P&gt;</description>
      <pubDate>Thu, 25 May 2017 03:09:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361465#M64486</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2017-05-25T03:09:47Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361584#M64489</link>
      <description>Thank you. I hate to be too basic here but the last two lines are giving me errors on the syntax so I've pasted it below for correction. What am I doing wrong? Also could you explain the logic of what the having line is doing?&lt;BR /&gt;&lt;BR /&gt;proc sql;&lt;BR /&gt;&lt;BR /&gt;create table matches as select a.*, b.id2 as matching_id2, b.id3 as matching_id3 from datasetname&lt;BR /&gt;&lt;BR /&gt;having a, having b&lt;BR /&gt;&lt;BR /&gt;where a.id1 ne b.id1 and (a.id1=b.id2 or a.id1=b.id3);&lt;BR /&gt;&lt;BR /&gt;quit;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 25 May 2017 13:24:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361584#M64489</guid>
      <dc:creator>TechRA</dc:creator>
      <dc:date>2017-05-25T13:24:17Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361593#M64491</link>
      <description>&lt;P&gt;It looks like&amp;nbsp; you dropped the word "FROM" within the SELECT statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The word HAVE (or HAVING) is supposed to represent the name of your data set.&amp;nbsp; Most solutions that are posted here will use WANT as the name of the data set you want to obtain, and HAVE as the name of the data set you are starting with.&amp;nbsp; So just replace that with the name of your data set.&lt;/P&gt;</description>
      <pubDate>Thu, 25 May 2017 13:47:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361593#M64491</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2017-05-25T13:47:18Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361597#M64493</link>
      <description>To me the data structure looks awkward. Shouldn't the data be transposed? Then it would much easier to keep track on duplicates and other processing as well.</description>
      <pubDate>Thu, 25 May 2017 13:49:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361597#M64493</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2017-05-25T13:49:55Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate check across columns</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361800#M64509</link>
      <description>Thanks for the help all! Took me a little bit to figure out the unexpected results which was primarily due not using an additional identifier for this check.&lt;BR /&gt;&lt;BR /&gt;Also I agree that transposing is the simplest option. However, the ultimate goal was to keep the dataset at one observation per person so I thought it might be more efficient to keep it flat rather than restructure and reflatten.</description>
      <pubDate>Thu, 25 May 2017 22:28:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Duplicate-check-across-columns/m-p/361800#M64509</guid>
      <dc:creator>TechRA</dc:creator>
      <dc:date>2017-05-25T22:28:05Z</dc:date>
    </item>
  </channel>
</rss>

