<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What does &amp;quot;unique&amp;quot; do here for the count function alongside group by statement? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888808#M351122</link>
    <description>That makes sense - so if PERSON_ID is always not missing then count(*) = count(person_id)? Also, how does that work with the group by statement? I see &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/3492"&gt;@JosvanderVelden&lt;/a&gt; 's reply, but the post they linked only says there's a caveat with the aggregate function, but doesn't say what the caveat is.</description>
    <pubDate>Thu, 10 Aug 2023 15:25:46 GMT</pubDate>
    <dc:creator>newbatprocsql1</dc:creator>
    <dc:date>2023-08-10T15:25:46Z</dc:date>
    <item>
      <title>What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888805#M351119</link>
      <description>&lt;P&gt;I'm pretty new to proc sql, but I was trying to count the number of people. Each unique person can have multiple rows (multiple application submissions). An example of a table below is:&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Person_ID&lt;/TD&gt;&lt;TD&gt;Category&lt;/TD&gt;&lt;TD&gt;Ref_type&lt;/TD&gt;&lt;TD&gt;Total_amt&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;Green&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;350&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;Blue&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;300&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;TD&gt;Red&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;Green&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;20&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;TD&gt;Black&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;500&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;300&lt;/TD&gt;&lt;TD&gt;Blue&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;200&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;BR /&gt;I want to count the number of people in each Category*Ref_type, so it'll be&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Category&lt;/TD&gt;&lt;TD&gt;Ref_type&lt;/TD&gt;&lt;TD&gt;No_ppl&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Green&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Green&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Blue&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Red&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Black&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm not at home right now so I can't check but I remember that I tried doing&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc sql;
create table want as
select Category
,Ref_type
,count(unique Person_ID)
from have 
group by Category, Ref_type
;quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;which gave me not exactly what I wanted (note that this was for a large dataset, probably 2 million rows). When I removed the "unique" keyword, I got what I wanted.&lt;/P&gt;&lt;P&gt;Can someone explain to me what the code with the "unique" when I'm using it with the count function, and a group by statement, and also what the code without the "unique" function does when I'm counting a specific variable that's not in the group by statement?&lt;/P&gt;</description>
      <pubDate>Thu, 10 Aug 2023 15:02:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888805#M351119</guid>
      <dc:creator>newbatprocsql1</dc:creator>
      <dc:date>2023-08-10T15:02:41Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888806#M351120</link>
      <description>Have a look at the solution for this track: &lt;A href="https://communities.sas.com/t5/SAS-Procedures/unique-function-in-proc-sql/td-p/251359" target="_blank"&gt;https://communities.sas.com/t5/SAS-Procedures/unique-function-in-proc-sql/td-p/251359&lt;/A&gt;</description>
      <pubDate>Thu, 10 Aug 2023 15:08:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888806#M351120</guid>
      <dc:creator>JosvanderVelden</dc:creator>
      <dc:date>2023-08-10T15:08:45Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888807#M351121</link>
      <description>&lt;P&gt;How many people do you think you have in your fist listing?&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE border="1"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;Person_ID&lt;/TD&gt;
&lt;TD&gt;Category&lt;/TD&gt;
&lt;TD&gt;Ref_type&lt;/TD&gt;
&lt;TD&gt;Total_amt&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;100&lt;/TD&gt;
&lt;TD&gt;Green&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;TD&gt;350&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;100&lt;/TD&gt;
&lt;TD&gt;Blue&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;TD&gt;300&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;100&lt;/TD&gt;
&lt;TD&gt;Red&lt;/TD&gt;
&lt;TD&gt;3&lt;/TD&gt;
&lt;TD&gt;100&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;200&lt;/TD&gt;
&lt;TD&gt;Green&lt;/TD&gt;
&lt;TD&gt;1&lt;/TD&gt;
&lt;TD&gt;20&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;200&lt;/TD&gt;
&lt;TD&gt;Black&lt;/TD&gt;
&lt;TD&gt;3&lt;/TD&gt;
&lt;TD&gt;500&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;300&lt;/TD&gt;
&lt;TD&gt;Blue&lt;/TD&gt;
&lt;TD&gt;2&lt;/TD&gt;
&lt;TD&gt;200&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I see 6 observations for 3 people.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;COUNT(*) will count how many observations.&lt;/P&gt;
&lt;P&gt;COUNT(PERSON_ID) will count how many observations have a non-missing value of PERSON_ID.&lt;/P&gt;
&lt;P&gt;COUNT(DISTINCT PERSON_ID) will count how many people.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 10 Aug 2023 15:17:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888807#M351121</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-08-10T15:17:34Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888808#M351122</link>
      <description>That makes sense - so if PERSON_ID is always not missing then count(*) = count(person_id)? Also, how does that work with the group by statement? I see &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/3492"&gt;@JosvanderVelden&lt;/a&gt; 's reply, but the post they linked only says there's a caveat with the aggregate function, but doesn't say what the caveat is.</description>
      <pubDate>Thu, 10 Aug 2023 15:25:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888808#M351122</guid>
      <dc:creator>newbatprocsql1</dc:creator>
      <dc:date>2023-08-10T15:25:46Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888810#M351123</link>
      <description>&lt;P&gt;GROUP BY separates the observations into groups based on the values of the variables listed.&amp;nbsp; When you have GROUP BY the aggregate functions, like COUNT(),MAX(),SUM(), etc operate over each group separately.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also when you have GROUP BY you need to take care what columns (variables) you list in the SELECT clause.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input Person_ID $ Category $ Ref_type $ Total_amt;
cards;
100 Green 2 350
100 Blue 2 300
100 Red 3 100
200 Green 1 20
200 Black 3 500
300 Blue 2 200
;

proc sql ;
 select category,ref_type
      , count(*) as nobs
      , count(person_id) as n_with_ids
      , count(person_id) as n_people
      , sum(total_amt) as sum_total_amt
   from have
   group by category,ref_type
 ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Result&lt;/P&gt;
&lt;PRE&gt;                                                      sum_total_
Category  Ref_type      nobs  n_with_ids  n_people           amt
----------------------------------------------------------------
Black     3                1           1         1           500
Blue      2                2           2         2           500
Green     1                1           1         1            20
Green     2                1           1         1           350
Red       3                1           1         1           100
&lt;/PRE&gt;
&lt;P&gt;But if you include a variable that is neither one of the grouping variables nor an aggregate result then SAS will re-merge the computed aggregate values back with all of the observations selected in that group.&amp;nbsp; It will tell you this in the LOG.&lt;/P&gt;
&lt;PRE&gt;1178  proc sql ;
1179   select category,ref_type,total_amt
1180        , count(*) as nobs
1181        , count(person_id) as n_with_ids
1182        , count(person_id) as n_people
1183        , sum(total_amt) as sum_total_amt
1184     from have
1185     group by category,ref_type
1186   ;
NOTE: The query requires remerging summary statistics back with the original data.
1187  quit;
&lt;/PRE&gt;
&lt;P&gt;Result&lt;/P&gt;
&lt;PRE&gt;                                                                 sum_total_
Category  Ref_type  Total_amt      nobs  n_with_ids  n_people           amt
---------------------------------------------------------------------------
Black     3               500         1           1         1           500
Blue      2               200         2           2         2           500
Blue      2               300         2           2         2           500
Green     1                20         1           1         1            20
Green     2               350         1           1         1           350
Red       3               100         1           1         1           100
&lt;/PRE&gt;</description>
      <pubDate>Thu, 10 Aug 2023 15:47:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888810#M351123</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-08-10T15:47:43Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888811#M351124</link>
      <description>Thanks! That's really helpful. I think now I'm just not understanding what the difference between using&lt;BR /&gt;count(*), count(unique person_id) and count(person_id) is when using that "group by category, ref_type". It seems like it will always give the same result (but I know that count(unique person_id) and count(person_id) would give different results in general, I just can't see why, since we are using the group by statement).</description>
      <pubDate>Thu, 10 Aug 2023 15:55:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888811#M351124</guid>
      <dc:creator>newbatprocsql1</dc:creator>
      <dc:date>2023-08-10T15:55:37Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888812#M351125</link>
      <description>I say that will give different results in general, because I remember taking away the "unique" keyword and it gave different results. Also, did you mean to type count(person_id) twice in the proc sql statements?</description>
      <pubDate>Thu, 10 Aug 2023 15:56:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888812#M351125</guid>
      <dc:creator>newbatprocsql1</dc:creator>
      <dc:date>2023-08-10T15:56:57Z</dc:date>
    </item>
    <item>
      <title>Re: What does "unique" do here for the count function alongside group by statement?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888813#M351126</link>
      <description>&lt;P&gt;No. Second one should have the DISTINCT keyword.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;They will yield the same count when the three variable combination of PERSON_ID, CATEGORY, and REF_TYPE uniquely identify the observation.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 10 Aug 2023 16:01:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-does-quot-unique-quot-do-here-for-the-count-function/m-p/888813#M351126</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-08-10T16:01:41Z</dc:date>
    </item>
  </channel>
</rss>

