<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to use proc sql to merge and match by unique ID? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892183#M352396</link>
    <description>&lt;P&gt;How to use proc sql to merge 2 datasets with similar variables (4 different variables in one dataset) but want to match it by a Unique_ID ?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have 2 datasets dataset1 has about 20 variables including a variabled called&amp;nbsp;Unique_ID.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;dataset2 has about 24 variables also including&amp;nbsp;Unique_ID&amp;nbsp; but has 4 other variables that I want to merge into a table with.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to merge these tables and match it by Unique_ID.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've tried the code below but when I do, it will multiply my observations which juristically increases my rows and I do not want that. What am I doing wrong?&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc sql;
    create table test as
    select *
    from dataset1, dataset2
    where dataset1.Unique_ID=dataset2.Unique_ID;
    quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 31 Aug 2023 21:00:23 GMT</pubDate>
    <dc:creator>rebelde52</dc:creator>
    <dc:date>2023-08-31T21:00:23Z</dc:date>
    <item>
      <title>How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892183#M352396</link>
      <description>&lt;P&gt;How to use proc sql to merge 2 datasets with similar variables (4 different variables in one dataset) but want to match it by a Unique_ID ?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have 2 datasets dataset1 has about 20 variables including a variabled called&amp;nbsp;Unique_ID.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;dataset2 has about 24 variables also including&amp;nbsp;Unique_ID&amp;nbsp; but has 4 other variables that I want to merge into a table with.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to merge these tables and match it by Unique_ID.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've tried the code below but when I do, it will multiply my observations which juristically increases my rows and I do not want that. What am I doing wrong?&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc sql;
    create table test as
    select *
    from dataset1, dataset2
    where dataset1.Unique_ID=dataset2.Unique_ID;
    quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 31 Aug 2023 21:00:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892183#M352396</guid>
      <dc:creator>rebelde52</dc:creator>
      <dc:date>2023-08-31T21:00:23Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892187#M352398</link>
      <description>&lt;P&gt;If the number of rows increase in TEST, that means that either the column dataset1.Unique_ID or column dataset2.Unique_ID or both do NOT contain unique values for each row. Do you expect them to be unique?&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2023 21:25:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892187#M352398</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-08-31T21:25:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892189#M352399</link>
      <description>I excpect for both of them to have it. But if one or neither have it then how should i approach?</description>
      <pubDate>Thu, 31 Aug 2023 21:53:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892189#M352399</guid>
      <dc:creator>rebelde52</dc:creator>
      <dc:date>2023-08-31T21:53:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892191#M352400</link>
      <description>&lt;P&gt;If you are expecting your row IDs to be unique, but they are not then you need to investigate why and then decide on a process to make them unique. This is a data understanding issue. You may find PROC FREQ&amp;nbsp; useful to understanding where your duplicates are:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc freq data = test;
  table uniqueID1 uniqueID2 / missing;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Count will be greater than 1 for any duplicates.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2023 22:09:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892191#M352400</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-08-31T22:09:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892195#M352401</link>
      <description>&lt;P&gt;To define the appropriate join you need to know two things:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;What is the relationship between the two tables?
&lt;OL&gt;
&lt;LI&gt;1:1, 1:many, many:many&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;What do you want as result?
&lt;OL&gt;
&lt;LI&gt;only the rows where the keys match (inner join)&lt;/LI&gt;
&lt;LI&gt;all rows from the one table but only matching rows from the left table (left or right join)&lt;/LI&gt;
&lt;LI&gt;all rows from both tables (full join)&lt;/LI&gt;
&lt;LI&gt;....&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Below two links are hopefully useful to you.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://blogs.sas.com/content/sastraining/2015/05/27/life-saver-tip-for-comparing-proc-sql-join-with-sas-data-step-merge/" target="_blank" rel="noopener"&gt;https://blogs.sas.com/content/sastraining/2015/05/27/life-saver-tip-for-comparing-proc-sql-join-with-sas-data-step-merge/&lt;/A&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/doc/en/vdmmlcdc/8.1/sqlproc/p1bk7i6jqseje7n1lifcip8kzhpp.htm" target="_blank" rel="noopener"&gt;https://documentation.sas.com/doc/en/vdmmlcdc/8.1/sqlproc/p1bk7i6jqseje7n1lifcip8kzhpp.htm&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2023 23:37:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892195#M352401</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-08-31T23:37:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892197#M352402</link>
      <description>This is helpful! Question though. I checked my data freq by the unique_id on a specific id #. For example dateset1 had 8 observations for that id#. Dataset2 had 16 for that specific id#. Joineddataset had 128 observations for that specific id#. Therefor it seemed like it multiplied? Why is that? Do I just deduplicate it?</description>
      <pubDate>Thu, 31 Aug 2023 23:53:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892197#M352402</guid>
      <dc:creator>rebelde52</dc:creator>
      <dc:date>2023-08-31T23:53:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to use proc sql to merge and match by unique ID?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892199#M352404</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/441125"&gt;@rebelde52&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;This is helpful! Question though. I checked my data freq by the unique_id on a specific id #. For example dateset1 had 8 observations for that id#. Dataset2 had 16 for that specific id#. Joineddataset had 128 observations for that specific id#. Therefor it seemed like it multiplied? Why is that? Do I just deduplicate it?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;That means you've got a many:many relationship between your tables and unique_id is not unique.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Even a SQL inner join will return 8X16 rows which is what you observe.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How would you deduplicate? You would need for at least one of the tables all variable values of interest to be identical for a specific "unique" id. Is that the case? If not then which values would you pick in order to "dedup"?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 01 Sep 2023 00:17:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-use-proc-sql-to-merge-and-match-by-unique-ID/m-p/892199#M352404</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-09-01T00:17:21Z</dc:date>
    </item>
  </channel>
</rss>

