<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Convert proc sort nodupkey to proc sql in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600604#M173649</link>
    <description>wow that worked like a magic.</description>
    <pubDate>Thu, 31 Oct 2019 00:38:30 GMT</pubDate>
    <dc:creator>lydiawawa</dc:creator>
    <dc:date>2019-10-31T00:38:30Z</dc:date>
    <item>
      <title>Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600566#M173629</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to convert proc sort nodupkey to proc sql with a counter of distinct rows defined by 2 variables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Dataset:&lt;/P&gt;&lt;P&gt;Have&lt;/P&gt;&lt;P&gt;key&amp;nbsp; mode&amp;nbsp; time&amp;nbsp; x&amp;nbsp; y z&lt;/P&gt;&lt;P&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; a&amp;nbsp; b&amp;nbsp; c&lt;/P&gt;&lt;P&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp;&amp;nbsp; d&amp;nbsp;&amp;nbsp; a&amp;nbsp; b&lt;/P&gt;&lt;P&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; c&amp;nbsp;&amp;nbsp; v&amp;nbsp; a&lt;/P&gt;&lt;P&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp; a&amp;nbsp;&amp;nbsp; b&amp;nbsp; c&lt;/P&gt;&lt;P&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp; s&amp;nbsp; d&amp;nbsp;&amp;nbsp; e&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Want&lt;/P&gt;&lt;P&gt;key&amp;nbsp; mode&amp;nbsp; time&amp;nbsp; x&amp;nbsp; y z&lt;/P&gt;&lt;P&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; a&amp;nbsp; b&amp;nbsp; c&lt;/P&gt;&lt;P&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; c&amp;nbsp;&amp;nbsp; v&amp;nbsp; 
a&lt;/P&gt;&lt;P&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp; a&amp;nbsp;&amp;nbsp; b&amp;nbsp; c&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For proc sort, I first sorted the dataset by key, mode and time:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data = have; by key mode time; run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Then, to remove duplicates, I would like to keep the earliest time, which has already been taken care of by the previous proc sort:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data = have out = want nodupkey; by key mode; run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I need to convert this procedure to proc sql with a counter that counts the distinct combinations of key and mode and produces the same output.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is what I have, but it doesn't generate the same number of obs as proc sort:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;  proc sql;
  CREATE TABLE want AS
  SELECT *, COUNT(DISTINCT(key||mode)) AS counter FROM want GROUP BY key, mode;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The following will produce the same number of obs as proc sort, but it will only let me keep key and mode:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;    proc sql;
  create table want as
  select key, mode, count(distinct key||mode) as counter from (select distinct * from have) group by key,mode; quit;

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Grateful for any help!&lt;/P&gt;</description>
      <pubDate>Wed, 30 Oct 2019 22:02:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600566#M173629</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-30T22:02:51Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600588#M173640</link>
      <description>&lt;P&gt;Your proc sort and your first SQL code gave me the same results, but based on your WANT, this is what I have. Let me know if this works for you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV&gt;&lt;FONT style="background-color: #ffffff;"&gt;&lt;BR /&gt;data have;&lt;BR /&gt;input key $&amp;nbsp; mode $&amp;nbsp; time&amp;nbsp; x $&amp;nbsp; y $ z $;&lt;BR /&gt;datalines;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT style="background-color: #ffffff;"&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; a&amp;nbsp; b&amp;nbsp; c&lt;BR /&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp;&amp;nbsp; d&amp;nbsp;&amp;nbsp; a&amp;nbsp; b&lt;BR /&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; int&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp; c&amp;nbsp;&amp;nbsp; v&amp;nbsp; a&lt;BR /&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp; a&amp;nbsp;&amp;nbsp; b&amp;nbsp; c&lt;BR /&gt;3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pho&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp; s&amp;nbsp; d&amp;nbsp;&amp;nbsp; e&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT style="background-color: #ffffff;"&gt;proc sql;&lt;BR /&gt;create table want as&lt;BR /&gt;select &lt;BR /&gt;key,&lt;BR /&gt;mode,&lt;BR /&gt;time,&lt;BR /&gt;x,&lt;BR /&gt;y,&lt;BR /&gt;z,&lt;BR /&gt;count(*) as Counter&lt;BR /&gt;from have&lt;BR /&gt;group by key&lt;BR /&gt;having time = min(time)&lt;BR /&gt;;&lt;BR /&gt;quit;&lt;/FONT&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 30 Oct 2019 23:12:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600588#M173640</guid>
      <dc:creator>NewSASPerson</dc:creator>
      <dc:date>2019-10-30T23:12:58Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600601#M173646</link>
      <description>&lt;P&gt;The raw dataset has over 50 variables, so I cannot select by variable name; I have to use select *. I changed the code to:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select
*,
count(*) as Counter
from have
group by key, mode
having time = min(time)
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Or&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select
*,
count(*) as Counter
from have
group by key
having time = min(time)
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The deduplicated output has more observations than the proc sort output.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 00:28:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600601#M173646</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-31T00:28:50Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600602#M173647</link>
      <description>&lt;P&gt;The WANT you indicated in your initial post didn't match the one you would get from the two proc sorts you showed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The following will match the proc sort result, but it uses an undocumented function, namely monotonic(), so I wouldn't suggest using it in production code. Also, while proc sql appears to read data sequentially, by definition there is no guarantee that it will always work that way:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input key  mode $  time  (x  y z) ($);
  cards;
1       int        1    a  b  c
1       pho      1    d   a  b
1       pho      2    dd   aa  bb
2       int        1    c   v  a
3       pho      1   a   b  c
3       pho      1   aa   bb  cc
3       pho      2   aaa  bbb   ccc
;
run;

proc sql;
 create table need as
   select *, monotonic() as count
     from have
       group by key, mode, time
 ;
 create table want (drop=count) as
   select *
     from need
       group by key, mode
         having count=min(count)
 ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
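&lt;P&gt;For anyone who would rather avoid the undocumented function, a minimal sketch of the same idea follows. It assumes the sequence number can be created in a DATA step from the documented automatic variable _N_. The names SEQ and NEED2 are hypothetical, chosen only for this illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumption: seq and need2 are illustrative names */
data need2;
  set have;
  seq = _n_;   /* physical record order in HAVE */
run;

proc sql;
  create table want (drop=seq) as
    select *
      from (select *
              from need2
                group by key, mode
                  having time = min(time))   /* earliest time per key and mode */
        group by key, mode
          having seq = min(seq)              /* first of any tied records */
  ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The inner query keeps the records with the earliest time in each key and mode group, and the outer query breaks any remaining ties by physical order, which is what the two proc sorts do when run with their default settings.&lt;/P&gt;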
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 00:31:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600602#M173647</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2019-10-31T00:31:48Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600604#M173649</link>
      <description>wow that worked like a magic.</description>
      <pubDate>Thu, 31 Oct 2019 00:38:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600604#M173649</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-31T00:38:30Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600605#M173650</link>
      <description>One more question: how do I know the unduped records take the minimum/earliest time in every key and mode group?</description>
      <pubDate>Thu, 31 Oct 2019 00:44:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600605#M173650</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-31T00:44:16Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600606#M173651</link>
      <description>&lt;P&gt;That's why I created file NEED, so that you could see which records had the minimum count for each group.&lt;/P&gt;
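&lt;P&gt;A rough check is sketched below, as an illustration rather than a guarantee. It compares the time kept in WANT with the minimum time per key and mode in HAVE. The alias MIN_TIME is a name made up for this example. If the query selects no rows, every kept record carries the earliest time of its group.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  /* list any kept record whose time is not the group minimum */
  select w.key, w.mode, w.time, m.min_time
    from want as w
      inner join
        (select key, mode, min(time) as min_time
           from have
             group by key, mode) as m
        on w.key = m.key and w.mode = m.mode
    where w.time ne m.min_time
  ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;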
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 00:55:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600606#M173651</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2019-10-31T00:55:38Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600607#M173652</link>
      <description>Seems like monotonic() is very similar to _N_ and I think the order of time is probably taken care of by the group by statement?</description>
      <pubDate>Thu, 31 Oct 2019 00:58:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600607#M173652</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-31T00:58:28Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600730#M173699</link>
      <description>&lt;P&gt;Like I said, the monotonic() function isn't documented, so users have only guessed at how it works (see:&amp;nbsp;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/MONOTONIC-function-in-PROC-SQL/ta-p/475752" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/MONOTONIC-function-in-PROC-SQL/ta-p/475752&lt;/A&gt;&amp;nbsp;).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, yes, I think the GROUP BY clause would cause the function to work as you want it to. But, as it isn't documented, there are no guarantees.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 15:46:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600730#M173699</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2019-10-31T15:46:07Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600780#M173719</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/30435"&gt;@lydiawawa&lt;/a&gt;: Proc SQL is not exactly a tool suitable for record unduplication, especially in situations where key ties have to be resolved based on which tied record comes first. That is why, if such a situation should occur, a sequential subkey must be added to the input, as &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt;&amp;nbsp;has shown in his post using the MONOTONIC() function. In the input below, this kind of circumstance is represented by record #1, which is tied on (key,mode,time) with record #2, a record I added to the sample input I've pilfered from &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13711"&gt;@art297&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But all of the above, plus adding a counter of the records within each unique (key,mode) pair, can be easily handled in a single step using the hash object:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                                                                                                                             
  input key mode $ time (x y z) ($) ;                                                                                                   
  cards;                                                                                                                                
1  int   1  a    b    c                                                                                                                 
1  int   1  a1   b1   c1                                                                                                                
1  pho   1  d    a    b                                                                                                                 
1  pho   2  dd   aa   bb                                                                                                                
2  int   1  c    v    a                                                                                                                 
3  pho   1  a    b    c                                                                                                                 
3  pho   1  aa   bb   cc                                                                                                                
3  pho   2  aaa  bbb  ccc                                                                                                               
;                                                                                                                                       
run ;                                                                                                                                   
                                                                                                                                        
data _null_ ;                                                                                                                           
  if _n_ = 1 then do ;                                                                                                                  
    dcl hash h (ordered:"a") ;                                                                                                          
    h.definekey ("key", "mode") ;                                                                                                       
    h.definedata ("key", "mode", "time", "x", "y", "z", "count") ;                                                                      
    h.definedone () ;                                                                                                                   
  end ;                                                                                                                                 
  set have end = lr ;                                                                                                                   
  _t = time ;                                                                                                                           
  if h.find() ne 0 then count = 1 ;                                                                                                     
  else                  count + 1 ;                                                                                                     
  if _t &amp;lt; time then time = _t ;                                                                                                         
  h.replace() ;                                                                                                                         
  if lr then h.output (dataset:"want") ;                                                                                                
run ;                  
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Between the records 1 and 2, which cannot be deduped based on the time variable, record 1 is selected because it comes physically first. The output then looks as follows:&lt;/P&gt;
&lt;PRE&gt;key    mode    time    x    y    z    count                                                                                             
-------------------------------------------                                                                                             
 1     int       1     a    b    c      2                                                                                               
 1     pho       1     d    a    b      2                                                                                               
 2     int       1     c    v    a      1                                                                                               
 3     pho       1     a    b    c      3 
&lt;/PRE&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D. &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2019 19:02:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600780#M173719</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-10-31T19:02:33Z</dc:date>
    </item>
    <item>
      <title>Re: Convert proc sort nodupkey to proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600782#M173720</link>
      <description>wow you guys are like saints. Thank you so much for helping me understand the ambiguities.</description>
      <pubDate>Thu, 31 Oct 2019 19:09:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Convert-proc-sort-nodupkey-to-proc-sql/m-p/600782#M173720</guid>
      <dc:creator>lydiawawa</dc:creator>
      <dc:date>2019-10-31T19:09:23Z</dc:date>
    </item>
  </channel>
</rss>

