<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Selecting distinct Combinations of two variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250630#M47269</link>
    <description>&lt;P&gt;If you use BY in a data step, the dataset needs to be sorted according to the BY statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once you have the distinct combinations and want the remaining variables you need to make a decision which records to keep and which to drop.&lt;/P&gt;
</description>
    <pubDate>Wed, 17 Feb 2016 14:45:09 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2016-02-17T14:45:09Z</dc:date>
    <item>
      <title>Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250626#M47268</link>
      <description>&lt;P&gt;Hello, I want to select the distinct combinations of two variables. One way is to use a data step, but do I need a sort step before it?&lt;/P&gt;
&lt;PRE&gt;data new_data;
set my_old_data;  
by ID date;
if first.ID and first.date;
run;&lt;/PRE&gt;
&lt;P&gt;Another way is to use PROC SQL, but how do I keep all the other variables?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table new_table as
select distinct ID, date 
from my_old_data;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 14:35:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250626#M47268</guid>
      <dc:creator>fengyuwuzu</dc:creator>
      <dc:date>2016-02-17T14:35:04Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250630#M47269</link>
      <description>&lt;P&gt;If you use BY in a data step, the dataset needs to be sorted according to the BY statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once you have the distinct combinations and want the remaining variables you need to make a decision which records to keep and which to drop.&lt;/P&gt;
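&lt;P&gt;A minimal sketch of that sequence, using a small made-up sample in place of my_old_data (the values and the variable VALUE are assumptions for illustration only). Note that first.ID and first.date are both true only on the first row of each ID, so the sketch tests first.date alone to get one row per ID/date combination.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical sample data standing in for my_old_data */
data my_old_data;
   input ID date :date9. value;
   format date date9.;
   datalines;
1 01JAN2016 10
1 01JAN2016 20
1 02JAN2016 30
2 01JAN2016 40
;
run;

/* BY processing in the data step needs the data in BY order */
proc sort data=my_old_data out=sorted;
   by ID date;
run;

data new_data;
   set sorted;
   by ID date;
   if first.date;   /* first row of each ID/date combination; other variables are kept */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With the default EQUALS behavior of PROC SORT, rows within a combination keep their input order, so the surviving row is the one that came first in the input. Adding a further variable to the BY statement of the sort is one way to make that choice explicit.&lt;/P&gt;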
</description>
      <pubDate>Wed, 17 Feb 2016 14:45:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250630#M47269</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-02-17T14:45:09Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250634#M47270</link>
      <description>&lt;P&gt;It depends on what your logic is. For the data step you posted, yes, you will have to sort the data first. In SQL you may not get the same result, because which row comes out for each distinct value depends on observation order. First you need to define which values should come out of the distinct. An example:&lt;BR /&gt;A &amp;nbsp; B &amp;nbsp; OTHER&lt;/P&gt;
&lt;P&gt;1 &amp;nbsp; 1 &amp;nbsp; ABC&lt;/P&gt;
&lt;P&gt;1 &amp;nbsp; 1 &amp;nbsp; DEF&lt;/P&gt;
&lt;P&gt;1 &amp;nbsp; 1 &amp;nbsp; EFG&lt;/P&gt;
&lt;P&gt;Now if I do a distinct on the above just for A/B, then the output is 1 &amp;nbsp;1. Simple. But if I want the other data, I need to define which one of the available rows to take. Should it be:&lt;BR /&gt;1 &amp;nbsp; 1 &amp;nbsp; ABC &amp;nbsp; &amp;nbsp;or 1 &amp;nbsp; 1 &amp;nbsp; EFG? Do you see the problem? With the data step approach you will have sorted the data first, and it will take the first row from the sort.&lt;/P&gt;
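&lt;P&gt;To make that concrete, here is a rough sketch using the three example rows above. The dataset names (have, sorted_asc, sorted_desc, first_asc, first_desc) are invented for illustration. The same first.B test keeps a different row depending on how OTHER was ordered in the sort.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* The three example rows from this post */
data have;
   input A B OTHER $;
   datalines;
1 1 ABC
1 1 DEF
1 1 EFG
;
run;

/* OTHER ascending within A/B */
proc sort data=have out=sorted_asc;
   by A B OTHER;
run;

data first_asc;
   set sorted_asc;
   by A B;
   if first.B;      /* keeps 1 1 ABC */
run;

/* OTHER descending within A/B */
proc sort data=have out=sorted_desc;
   by A B descending OTHER;
run;

data first_desc;
   set sorted_desc;
   by A B;
   if first.B;      /* keeps 1 1 EFG */
run;&lt;/CODE&gt;&lt;/PRE&gt;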
&lt;P&gt;Post an example of your data and how you want it ordered.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 14:50:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250634#M47270</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-02-17T14:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250635#M47271</link>
      <description>&lt;P&gt;Thanks. Sorting first is indeed necessary for the data step. I think the data step automatically keeps all other variables if I do not specify KEEP or DROP.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, if I use the other method, i.e. PROC SQL, how do I keep the other variables while selecting the distinct combinations of ID &amp;amp; date?&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 14:52:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250635#M47271</guid>
      <dc:creator>fengyuwuzu</dc:creator>
      <dc:date>2016-02-17T14:52:15Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250637#M47272</link>
      <description>&lt;P&gt;Consider this application of PROC SUMMARY to find the distinct values of age*sex&amp;nbsp;and the observation associated with the first occurrence.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc print data=sashelp.class;
   run;
proc summary data=sashelp.class nway;
   class age sex;
   output out=distinct(drop=_type_) idgroup(obs out[1](name height weight)=);
   run;
proc print;
   run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1916iC10D20CC6D236838/image-size/original?v=mpbl-1&amp;amp;px=-1" border="0" alt="Capture.PNG" title="Capture.PNG" /&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 16:53:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250637#M47272</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2016-02-17T16:53:59Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250642#M47273</link>
      <description>&lt;P&gt;Maybe this example is closer to what you actually want to do. Here SEX is like your ID variable and AGE is like DATE. The result is just as if the data had been sorted by SEX and AGE.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=sashelp.class nway;
   class sex;
   output out=distinct(drop=_type_) idgroup(min(age) obs out[1](age name height weight)=);
   run;&lt;/CODE&gt;&lt;/PRE&gt;
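&lt;P&gt;For comparison, a sort-based sketch that should pick the same rows (the youngest observation per SEX), assuming the default EQUALS behavior of PROC SORT. The dataset names by_sex_age and distinct_sort are invented for illustration, and the output lacks the _FREQ_ and _OBS_ variables that PROC SUMMARY adds.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: order the rows so the youngest comes first within each SEX */
proc sort data=sashelp.class out=by_sex_age;
   by sex age;
run;

/* Step 2: NODUPKEY keeps only the first row of each SEX group */
proc sort data=by_sex_age out=distinct_sort nodupkey;
   by sex;
run;

proc print data=distinct_sort;
run;&lt;/CODE&gt;&lt;/PRE&gt;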
&lt;P&gt;&lt;IMG src="https://communities.sas.com/t5/image/serverpage/image-id/1917i21EA474A64AFA34D/image-size/original?v=mpbl-1&amp;amp;px=-1" border="0" alt="Capture.PNG" title="Capture.PNG" /&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 15:11:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250642#M47273</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2016-02-17T15:11:18Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250644#M47274</link>
      <description>&lt;P&gt;SQL is not really built for that. It is columns based, and this is more of a row (observation) based query.&lt;/P&gt;
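&lt;P&gt;That said, PROC SQL can return whole rows if you give it an explicit rule for which row to keep in each group. A hedged sketch on sashelp.class, where AGE and SEX stand in for the two key variables and the rule (smallest NAME) is only an assumption for illustration; it relies on PROC SQL remerging the summary value with the detail rows.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
   create table distinct_sql as
   select *
   from sashelp.class
   group by age, sex
   having name = min(name);   /* rule: alphabetically first NAME per AGE/SEX group */
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the tie-breaking variable is not unique within a group, more than one row per combination can come back, so the rule has to identify a single row.&lt;/P&gt;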
&lt;P&gt;Unless you wish to do some fancy calculation of&amp;nbsp;which values to keep&amp;nbsp;from the other variables/columns, isn't this a simple PROC SORT NODUPKEY?&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 15:22:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250644#M47274</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-02-17T15:22:32Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250653#M47281</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13674"&gt;@LinusH&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;SQL is not really built for that. It is columns based, and this is more of a row (observation) based query.&lt;/P&gt;
&lt;P&gt;Unless you wish to do some fancy calculation of&amp;nbsp;which values to keep&amp;nbsp;from the other variables/columns, isn't this a simple PROC SORT NODUPKEY?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;To get what I think &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/56807"&gt;@fengyuwuzu&lt;/a&gt;&amp;nbsp;wants would take two sorts: one to sort by ID and DATE, and another using NODUPKEY to de-dup to distinct ID.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 16:52:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250653#M47281</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2016-02-17T16:52:27Z</dc:date>
    </item>
    <item>
      <title>Re: Selecting distinct Combinations of two variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250673#M47285</link>
      <description>&lt;P&gt;Thank you all for the replies. It was so nice to read them, and I learned a lot.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Feb 2016 18:40:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Selecting-distinct-Combinations-of-two-variables/m-p/250673#M47285</guid>
      <dc:creator>fengyuwuzu</dc:creator>
      <dc:date>2016-02-17T18:40:50Z</dc:date>
    </item>
  </channel>
</rss>

