<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Select distinct on different columns in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471137#M120653</link>
    <description>&lt;P&gt;I'm used to that functionality in PostgreSQL as well and had to resort to using PROC SORT.&amp;nbsp; However it does look like it works with PROC FEDSQL:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC FEDSQL;
	DROP TABLE example FORCE;
	CREATE TABLE example (
		id INTEGER,
		col1 INTEGER
	);
	INSERT INTO example(id, col1) VALUES (1, 2);
	INSERT INTO example(id, col1) VALUES (1, 3);
	INSERT INTO example(id, col1) VALUES (1, 4);
	SELECT DISTINCT ON (id) *
	FROM example;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Result:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1&amp;nbsp; 4&lt;/P&gt;</description>
    <pubDate>Mon, 18 Jun 2018 15:53:04 GMT</pubDate>
    <dc:creator>tomcmacdonald</dc:creator>
    <dc:date>2018-06-18T15:53:04Z</dc:date>
    <item>
      <title>Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471126#M120645</link>
      <description>&lt;DIV class="votecell post-layout--left"&gt;&lt;DIV class="vote"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="vote"&gt;&lt;SPAN class="vote-count-post "&gt;Hi,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class="postcell post-layout--right"&gt;&lt;DIV class="post-text"&gt;&lt;P&gt;I am trying to create a query in sas from postgresql which eliminates duplicate rows based on the value on a subset of columns. Let's assume that my database is as follows:&lt;/P&gt;&lt;PRE class="lang-sql prettyprint prettyprinted"&gt;&lt;CODE&gt;&lt;SPAN class="pln"&gt;name var1 var2 var3
   a    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;1&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;2&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;10&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;   &lt;BR /&gt;   a    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;1&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;2&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;26&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;   &lt;BR /&gt;   b    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;3&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;56&lt;/SPAN&gt;  &lt;SPAN class="lit"&gt;47&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;   &lt;BR /&gt;   c    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;4&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;78&lt;/SPAN&gt;  &lt;SPAN class="lit"&gt;50&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;For my purposes I would like to drop the second row (or the first one, it doesn't really matter) in order to have the table as:&lt;/P&gt;&lt;PRE class="lang-sql prettyprint prettyprinted"&gt;&lt;CODE&gt;&lt;SPAN class="pln"&gt;name var1 var2 var3
   a    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;1&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;2&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;10&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;   &lt;BR /&gt;   b    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;3&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;56&lt;/SPAN&gt;  &lt;SPAN class="lit"&gt;47&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;   &lt;BR /&gt;   c    &lt;/SPAN&gt;&lt;SPAN class="lit"&gt;4&lt;/SPAN&gt;   &lt;SPAN class="lit"&gt;78&lt;/SPAN&gt;  &lt;SPAN class="lit"&gt;50&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I have tried withe the following:&lt;/P&gt;&lt;PRE class="lang-sql prettyprint prettyprinted"&gt;&lt;CODE&gt;&lt;SPAN class="kwd"&gt;SELECT&lt;/SPAN&gt; &lt;SPAN class="kwd"&gt;DISTINCT&lt;/SPAN&gt; &lt;SPAN class="kwd"&gt;on&lt;/SPAN&gt; &lt;SPAN class="pun"&gt;(&lt;/SPAN&gt;&lt;SPAN class="pln"&gt;name&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;,&lt;/SPAN&gt;&lt;SPAN class="pln"&gt; var1&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;,&lt;/SPAN&gt;&lt;SPAN class="pln"&gt; var2&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;)&lt;BR /&gt;&lt;/SPAN&gt;name, var1, var2, var3
&lt;SPAN class="kwd"&gt;FROM&lt;/SPAN&gt; &lt;SPAN class="kwd"&gt;table&lt;/SPAN&gt;&lt;SPAN class="pun"&gt;;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;but without any success, can anyone help?&lt;/P&gt;&lt;P&gt;In Postgresq this would perform the distinct operation considering only the variables name, var1, var2 but keeping name,var1,var2,var3&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 18 Jun 2018 15:25:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471126#M120645</guid>
      <dc:creator>Marco_park</dc:creator>
      <dc:date>2018-06-18T15:25:09Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471129#M120646</link>
      <description>&lt;P&gt;I do not know what is&amp;nbsp;&lt;SPAN&gt;Postgresq&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In SAS, you can achieve with a proc sort or data step&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=your_dataset nodupkey;&lt;/P&gt;&lt;P&gt;by name;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Jun 2018 15:34:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471129#M120646</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-06-18T15:34:00Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471130#M120647</link>
      <description>&lt;P&gt;Can I do that into a proc sql? I need it to be done in proc sql?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Mon, 18 Jun 2018 15:34:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471130#M120647</guid>
      <dc:creator>Marco_park</dc:creator>
      <dc:date>2018-06-18T15:34:12Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471132#M120649</link>
      <description>&lt;P&gt;Something like this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input name $ var1 var2 var3;
cards;
   a    1   2   10   
   a    1   2   26   
   b    3   56  47   
   c    4   78  50
   ;


proc sql;
create table want as
select distinct name, var1, var2, var3
from have
group by name
having var3=min(var3);
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 18 Jun 2018 15:42:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471132#M120649</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-06-18T15:42:08Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471137#M120653</link>
      <description>&lt;P&gt;I'm used to that functionality in PostgreSQL as well and had to resort to using PROC SORT.&amp;nbsp; However it does look like it works with PROC FEDSQL:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC FEDSQL;
	DROP TABLE example FORCE;
	CREATE TABLE example (
		id INTEGER,
		col1 INTEGER
	);
	INSERT INTO example(id, col1) VALUES (1, 2);
	INSERT INTO example(id, col1) VALUES (1, 3);
	INSERT INTO example(id, col1) VALUES (1, 4);
	SELECT DISTINCT ON (id) *
	FROM example;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Result:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1&amp;nbsp; 4&lt;/P&gt;</description>
      <pubDate>Mon, 18 Jun 2018 15:53:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471137#M120653</guid>
      <dc:creator>tomcmacdonald</dc:creator>
      <dc:date>2018-06-18T15:53:04Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471353#M120729</link>
      <description>&lt;P&gt;Basically I have to translate this code&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc sort data=my_db nodupkey;
	by var1 var2 var3 var4;
run;&lt;/PRE&gt;&lt;P&gt;in a proc sql, however the database has more then only 4 variables.&lt;/P&gt;&lt;P&gt;How can I do it?&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 12:51:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471353#M120729</guid>
      <dc:creator>Marco_park</dc:creator>
      <dc:date>2018-06-19T12:51:41Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471367#M120739</link>
      <description>&lt;P&gt;This might work only if there are distinct values in Var3 for each group in name variable because if the values of Var3 are 10 and 10 against the data values "a" in name variable then you will get both the records.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 14:01:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471367#M120739</guid>
      <dc:creator>MadhuKorni</dc:creator>
      <dc:date>2018-06-19T14:01:49Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471368#M120740</link>
      <description>&lt;P&gt;Yes the entries in var1, var2 and var3 are all different. This is not the case in the var 4-10 but I don't care which ones will be dropped.&lt;/P&gt;&lt;P&gt;Is it possible to do it in a proc sql?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 14:02:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471368#M120740</guid>
      <dc:creator>Marco_park</dc:creator>
      <dc:date>2018-06-19T14:02:22Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471370#M120742</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/48764"&gt;@MadhuKorni&lt;/a&gt;&amp;nbsp;No, you obviously haven't tested my code. Your thought is handled by my having clause. Please test&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 14:09:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471370#M120742</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-06-19T14:09:12Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471377#M120744</link>
      <description>&lt;P&gt;If the first three variables of a by group are same then your code works.&lt;/P&gt;
&lt;P&gt;What If the value in var2 of the 2nd row is changed to 1. Try executing your code with the below dataset.&lt;/P&gt;
&lt;P&gt;data have;&lt;BR /&gt;input name $ var1 var2 var3;&lt;BR /&gt;cards;&lt;BR /&gt;a 1 2 10 &lt;BR /&gt;a 1 1 10 &lt;BR /&gt;b 3 56 47 &lt;BR /&gt;c 4 78 50&lt;BR /&gt;;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 14:31:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471377#M120744</guid>
      <dc:creator>MadhuKorni</dc:creator>
      <dc:date>2018-06-19T14:31:16Z</dc:date>
    </item>
    <item>
      <title>Re: Select distinct on different columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471383#M120745</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/48764"&gt;@MadhuKorni&lt;/a&gt;&amp;nbsp; Fair point, that leads me to now think statistically,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please test with some intuitive combinations and let me know&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input name $ var1 var2 var3;
cards;
a 1 2 10 
a 1 1 10 
b 3 56 47 
c 4 78 50
;

proc sql;
create table want(drop=std) as
select  distinct name, var1, var2, var3,std(var1,var2,var3) as std
from have
group by name
having std=min(std)  ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;OK, The above may still cause a problem for values like&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;a 1 2 1 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;a 2 1 1&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This can be fixed by weigting on of the variables multiplied with a constant. For example,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want(drop=std) as
select  distinct name, var1, var2, var3,std(var1*2,var2,var3) as std
from have
group by name
having min(std)=std;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Jun 2018 14:59:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Select-distinct-on-different-columns/m-p/471383#M120745</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-06-19T14:59:46Z</dc:date>
    </item>
  </channel>
</rss>

