- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is a pretty basic question, but I wanted some confirmation since I haven't been able to find a clear answer.
I have a database of users of a certain services we provide in a given year. Sometimes these users return for the same service or other services. I want to find how many distinct users I have of each service so I constructed this code:
proc sql;
create table SERVICES as
select
service,
count(distinct user_id) as Num_Users
from users
where year = 2017
group by service;
quit;
I want to know if this is counting unique users of each service or unique users from the total database. For example, if person A used Service1 and Service2 are they being counted in both service groupings or are they being counted as 1 unique user? Is this the correct interpretation of this code?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You wouldn't want to use a where clause with a group by. In SQL you would use a having clause with a group by:
group by service having year=2017;
Or you could sub-query:
from (select * from users where year=2017) group by service;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
A very small data set to test with such as:
data users; input user_id Service; year=2017; datalines; 1 1 1 2 ; run;
should fairly easily answer this.
Note that the output of the Proc SQL for that example is 2 records with a count of one each. Since there are only two records input ...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I pretty much use HAVING only for filtering on aggregations.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I can't remember why exactly, must have had issues with it at some point in the past so now I totally avoid using where clauses on group by's and put them in subqueries or havings. It may just be habit from my side. Definately can't use use where for aggregates.