<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Distinct count other than proc sql in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545755#M151037</link>
    <description>&lt;P&gt;Do a proc freq and put all three variables as tables:&lt;/P&gt;
&lt;PRE&gt;proc freq data=have;
  tables product * brand * date / out=want;
run;&lt;/PRE&gt;
&lt;P&gt;This will give a you a large table with the various counts - note you will need to refine it (not going to type test data in to do this for you) to only get the results you want out, but it should be way faster than SQL.&amp;nbsp; Note, I assume all three are character variables, you say date, but do not convert it in the catx so you are either implicitly converting it (bad) or its text (again not optimal).&lt;/P&gt;</description>
    <pubDate>Mon, 25 Mar 2019 11:47:40 GMT</pubDate>
    <dc:creator>RW9</dc:creator>
    <dc:date>2019-03-25T11:47:40Z</dc:date>
    <item>
      <title>Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545753#M151035</link>
      <description>&lt;DIV class="lia-message-heading lia-component-message-header"&gt;&lt;DIV class="lia-quilt-row lia-quilt-row-standard"&gt;&lt;DIV class="lia-quilt-column lia-quilt-column-20 lia-quilt-column-left"&gt;&lt;DIV class="lia-quilt-column-alley lia-quilt-column-alley-left"&gt;&lt;DIV class="lia-message-subject"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P class="lia-message-dates lia-message-post-date lia-component-post-date-last-edited"&gt;It's more about optimization question. I have a table with 1b record. Now I want to check if the total record is matched with distinct count based on 3 different variables. Though I have a query which runs great on small data, but takes lots of time when I run this on huge data like 1b record.&lt;/P&gt;&lt;DIV class="lia-message-body"&gt;&lt;DIV class="lia-message-body-content"&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;table1&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;product&lt;/TD&gt;&lt;TD&gt;brand&lt;/TD&gt;&lt;TD&gt;date&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;b1&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;b3&lt;/TD&gt;&lt;TD&gt;03/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;b3&lt;/TD&gt;&lt;TD&gt;03/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;b4&lt;/TD&gt;&lt;TD&gt;04/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;b1&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;b2&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;b3&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;b4&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;b1&lt;/TD&gt;&lt;TD&gt;01/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;b1&lt;/TD&gt;&lt;TD&gt;02/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;5&lt;/TD&gt;&lt;TD&gt;b3&lt;/TD&gt;&lt;TD&gt;03/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;5&lt;/TD&gt;&lt;TD&gt;b3&lt;/TD&gt;&lt;TD&gt;03/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;6&lt;/TD&gt;&lt;TD&gt;b4&lt;/TD&gt;&lt;TD&gt;03/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;6&lt;/TD&gt;&lt;TD&gt;b4&lt;/TD&gt;&lt;TD&gt;04/10/2013&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The query is;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token procnames"&gt;data&lt;/SPAN&gt; have&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;
&lt;SPAN class="token statement"&gt;infile&lt;/SPAN&gt; datalines dlm&lt;SPAN class="token operator"&gt;=&lt;/SPAN&gt;&lt;SPAN class="token string"&gt;'09'&lt;/SPAN&gt;x&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;
&lt;SPAN class="token keyword"&gt;input&lt;/SPAN&gt; product brand &lt;SPAN class="token punctuation"&gt;$&lt;/SPAN&gt; &lt;SPAN class="token function"&gt;date&lt;/SPAN&gt; :mmddyy10&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;
&lt;SPAN class="token procnames"&gt;format&lt;/SPAN&gt; &lt;SPAN class="token function"&gt;date&lt;/SPAN&gt; mmddyy10&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;
&lt;SPAN class="token datalines"&gt;&lt;SPAN class="token keyword"&gt;datalines&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;SPAN class="token data string"&gt;1	b1	01/10/2013
1	b3	03/10/2013
1	b3	03/10/2013
1	b4	04/10/2013
2	b1	01/10/2013
2	b2	01/10/2013
2	b3	01/10/2013
2	b4	01/10/2013
1	b1	01/10/2013
4	b1	02/10/2013
5	b3	03/10/2013
5	b3	03/10/2013
6	b4	03/10/2013
6	b4	04/10/2013&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/SPAN&gt;
&lt;SPAN class="token procnames"&gt;run&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Proc SQL; Select Count(distinct(catx(product,brand,date))),count(*)&amp;nbsp; from have; quit;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am using catx becuase if there any space in any of these 3 column, that will be removed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Above query will give the output 11 &amp;amp; 14.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there any way to optimise this query. so I can run on the huge data set.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 25 Mar 2019 11:47:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545753#M151035</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T11:47:49Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545755#M151037</link>
      <description>&lt;P&gt;Do a proc freq and put all three variables as tables:&lt;/P&gt;
&lt;PRE&gt;proc freq data=have;
  tables product * brand * date / out=want;
run;&lt;/PRE&gt;
&lt;P&gt;This will give a you a large table with the various counts - note you will need to refine it (not going to type test data in to do this for you) to only get the results you want out, but it should be way faster than SQL.&amp;nbsp; Note, I assume all three are character variables, you say date, but do not convert it in the catx so you are either implicitly converting it (bad) or its text (again not optimal).&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 11:47:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545755#M151037</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2019-03-25T11:47:40Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545756#M151038</link>
      <description>&lt;P&gt;i put the code for example dataset. your query is giving me another huge table. I just need two number as an ouput&lt;/P&gt;&lt;P&gt;14&lt;/P&gt;&lt;P&gt;11&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 11:49:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545756#M151038</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T11:49:31Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545760#M151041</link>
      <description>&lt;P&gt;Refer to: "&lt;SPAN&gt;This will give a you a large table with the various counts - note you will need to refine it (not going to type test data in to do this for you) to only get the results you want out"&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 11:52:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545760#M151041</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2019-03-25T11:52:06Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545761#M151042</link>
      <description>I know proc freq, my query is getting these numbers. Whats the point of this answer which is not gonna help... I am not very good in SAS thats why I am putting this question here.Any way if you can't, thanks for your effort.</description>
      <pubDate>Mon, 25 Mar 2019 11:54:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545761#M151042</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T11:54:18Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545771#M151044</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/200312"&gt;@Srigyan&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Whats the point of this answer which is not gonna help...&amp;nbsp;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;What's the point of insisting that you can't use PROC SQL, when it is the perfect tool for this situation and gets you the exact answer that you want?&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 12:19:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545771#M151044</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-03-25T12:19:27Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545773#M151045</link>
      <description>Because I am running this proc sql on more than a billioin record and running proc sql will take lot of time. So I am looking for an optimisation. Hope you can help.</description>
      <pubDate>Mon, 25 Mar 2019 12:23:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545773#M151045</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T12:23:57Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545781#M151052</link>
      <description>&lt;P&gt;Firstly make an index on it ,and no need catx() which can make your code very slow .&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile datalines expandtabs ;
input product brand $ date :mmddyy10.;
format date mmddyy10.;
datalines;
1	b1	01/10/2013
1	b3	03/10/2013
1	b3	03/10/2013
1	b4	04/10/2013
2	b1	01/10/2013
2	b2	01/10/2013
2	b3	01/10/2013
2	b4	01/10/2013
1	b1	01/10/2013
4	b1	02/10/2013
5	b3	03/10/2013
5	b3	03/10/2013
6	b4	03/10/2013
6	b4	04/10/2013
;
run;

options bufsize=128K bufno=100;
Proc SQL; 
create index x on have(product,brand,date);

Select Count(*)
from (select distinct product,brand,date from have); 
select count(*) from have;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 25 Mar 2019 12:37:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545781#M151052</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2019-03-25T12:37:55Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545786#M151055</link>
      <description>&lt;P&gt;Perfect!!!&lt;/P&gt;&lt;P&gt;Thanks a lot.&lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 12:56:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545786#M151055</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T12:56:20Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545787#M151056</link>
      <description>&lt;P&gt;Are you merely trying to generate a true/false result on whether the cardinality of the 3-variable-combination is the same as the number of obs in the data set?&amp;nbsp; Or do you need to know exactly how many 3-var-combos there are?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you just want the true/false, then&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;you can stop checking your data upon discovery of&amp;nbsp;the first duplicate, which might occur long before you read then entire file.
&lt;OL&gt;
&lt;LI&gt;If&amp;nbsp; the data are sorted, as&amp;nbsp; in your example then your can&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  set have end=end_of_have;
  by product brand date;
  if last.date=0 then do;
    put 'found duplicate at ' _N_= (product brand date) (=);
	stop;
  end;
  if end_of_have then put  'No duplicates for dataset of ' _n_ ' observations.';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;LI&gt;But if it's NOT sorted the I would recommend a hash object to track the combinations already discovered.&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  set have end=end_of_have;
  if _n_=1 then do;
    declare hash h (dataset:'have (obs=0 keep=product brand date)');
	  h.definekey(all:'Y');
	  h.definedone();
  end;
  if h.add()^=0 then do;
    put 'found duplicate at ' _N_= (product brand date) (=);
	stop;
  end;
  if end_of_have then put  'No duplicates for dataset of ' _n_ ' observations.';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;/LI&gt;
&lt;LI&gt;But if&amp;nbsp; you actually want frequencies of each combination, then a modification of the two above (depending on whether the data is sorted) will be needed.&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Mon, 25 Mar 2019 12:58:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545787#M151056</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-03-25T12:58:02Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545790#M151057</link>
      <description>&lt;P&gt;Using CATX() is going to give you the wrong answer.&amp;nbsp; Especially misusing it by using one of the variables as the delimiter instead of constant.&lt;/P&gt;
&lt;P&gt;It will map different sets of values to the same resulting string, thereby giving you an under count of the combinations.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 13:11:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545790#M151057</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-03-25T13:11:20Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545791#M151058</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&amp;nbsp; &amp;nbsp;As usual your solutions are spot on and great. I am just wondering how the index which indeed is an implicit sort optimizes performance if indeed it is an extra pass. I lack clarity in the trade-off to determine how sometimes index can be costly or efficient not necessarily pertaining to this thread.&lt;/P&gt;
&lt;P&gt;Of course I am reading some papers of Michael raithel et al but not really getting a through grasping of performance of B-tree should the &lt;U&gt;&lt;EM&gt;unique combination be very large in proportion to the original count&lt;/EM&gt;&lt;/U&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PS&lt;/P&gt;
&lt;P&gt;If it's coming from you, it's ought to be good. I can conform to blind obedience.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 13:12:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545791#M151058</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-03-25T13:12:50Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545802#M151063</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/200312"&gt;@Srigyan&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;I know proc freq, my query is getting these numbers. Whats the point of this answer which is not gonna help... I am not very good in SAS thats why I am putting this question here.Any way if you can't, thanks for your effort.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Then it's ******* time you start to learn how to really use SAS, and that's where we're trying to help you. Rejecting that help won't make you lots of friends.&lt;/P&gt;
&lt;P&gt;The time to learn how to properly use the tools available is NOW. See Maxim 13.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 13:33:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545802#M151063</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2019-03-25T13:33:33Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545825#M151067</link>
      <description>&lt;P&gt;Thanks, I definetly look in to this. It is just I had some delivery.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Mar 2019 14:02:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/545825#M151067</guid>
      <dc:creator>Srigyan</dc:creator>
      <dc:date>2019-03-25T14:02:52Z</dc:date>
    </item>
    <item>
      <title>Re: Distinct count other than proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/546086#M151157</link>
      <description>&lt;P&gt;Hi nov,&lt;/P&gt;
&lt;P&gt;"&lt;SPAN&gt;I am just wondering how the index which indeed is an implicit sort optimizes performance if indeed it is an extra pass."&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;you are right. it is indeed an extra pass if OP only run this code once. But if OP run this code many times ,it would be efficient .&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;And for&amp;nbsp; " select count(*) from have ", if OP's dataset is in locale&amp;nbsp; ,then the following code would be efficient.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;%let dsid=%sysfunc(open(have));&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;%let nobs=%sysfunc(attrn(&amp;amp;dsid,nlobs));&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;%let dsid=%sysfunc(close(&amp;amp;dsid));&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;%put &amp;amp;nobs ;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;P.S. I am not a perfect one, sometimes I would make some stupid errors ,so don't matter my code.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You could do better than me. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Mar 2019 11:53:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Distinct-count-other-than-proc-sql/m-p/546086#M151157</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2019-03-26T11:53:38Z</dc:date>
    </item>
  </channel>
</rss>

