<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with Removing Duplicate Rows from Proc SQL Inner Join in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638849#M189986</link>
    <description>&lt;P&gt;In general you cannot limit to one observation per value of TX_ID using SQL syntax.&amp;nbsp; That is just the nature of how the SQL language works.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could add the DISTINCT keyword to your SELECT and eliminate duplicate observations.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select distinct ....&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Otherwise to get down to just one observation per TX_ID you will need to use some SAS code instead of SQL code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could use PROC SORT with the NODUPKEY option.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=TX_HR out=TX_HR_SUBSET nodupkey ;
  by tx_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If it matter which observation you keep you could order the observations within the values ot TX_ID by some other variables that will insure the one you want is at the top and then use data step with BY group processing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data tx_hr_subset;
  set tx_hr;
  by tx_id;
  if first.tx_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 10 Apr 2020 01:31:40 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2020-04-10T01:31:40Z</dc:date>
    <item>
      <title>Help with Removing Duplicate Rows from Proc SQL Inner Join</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638846#M189984</link>
      <description>&lt;P&gt;Greetings All,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am a new SAS user and am working on code to perform multiple inner joins using proc sql (Enterprise Guide 8.1). My output table contains multiple rows for each column (code below). I would love advise on how I can remove duplicate rows, keeping only 1 of each variable 'a.TX_ID'.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
	create table TX_HR as 
	select distinct a.TX_ID, a.*, b.*, c.* 
		from SRTR.tx_hr as a left join SRTR.immuno as b 
			on a.TRR_ID = b.TRR_ID
		left join SRTR.rec_histo as c
			on a.TX_ID = c.REC_HISTO_TX_ID;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thank you so much for your time.&lt;/P&gt;&lt;P&gt;Bill&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 01:07:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638846#M189984</guid>
      <dc:creator>wbaker0621</dc:creator>
      <dc:date>2020-04-10T01:07:26Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Removing Duplicate Rows from Proc SQL Inner Join</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638849#M189986</link>
      <description>&lt;P&gt;In general you cannot limit to one observation per value of TX_ID using SQL syntax.&amp;nbsp; That is just the nature of how the SQL language works.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could add the DISTINCT keyword to your SELECT and eliminate duplicate observations.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select distinct ....&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Otherwise to get down to just one observation per TX_ID you will need to use some SAS code instead of SQL code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could use PROC SORT with the NODUPKEY option.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=TX_HR out=TX_HR_SUBSET nodupkey ;
  by tx_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If it matter which observation you keep you could order the observations within the values ot TX_ID by some other variables that will insure the one you want is at the top and then use data step with BY group processing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data tx_hr_subset;
  set tx_hr;
  by tx_id;
  if first.tx_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 01:31:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638849#M189986</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-04-10T01:31:40Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Removing Duplicate Rows from Proc SQL Inner Join</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638854#M189988</link>
      <description>This is great, thank you so much for the quick and detailed reply.</description>
      <pubDate>Fri, 10 Apr 2020 02:21:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Removing-Duplicate-Rows-from-Proc-SQL-Inner-Join/m-p/638854#M189988</guid>
      <dc:creator>wbaker0621</dc:creator>
      <dc:date>2020-04-10T02:21:31Z</dc:date>
    </item>
  </channel>
</rss>

