BookmarkSubscribeRSS Feed
KellyJade
Calcite | Level 5

This is a pretty basic question, but I wanted some confirmation since I haven't been able to find a clear answer. 

 

I have a database of users of a certain services we provide in a given year. Sometimes these users return for the same service or other services. I want to find how many distinct users I have of each service so I constructed this code:

 

proc sql;
create table SERVICES as
select
service,
count(distinct user_id) as Num_Users
from users
where year = 2017
group by service;
quit;

 

I want to know if this is counting unique users of each service or unique users from the total database. For example, if person A used Service1 and Service2 are they being counted in both service groupings or are they being counted as 1 unique user? Is this the correct interpretation of this code? 

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You wouldn't want to use a where clause with a group by.  In SQL you would use a having clause with a group by:

group by service having year=2017;

Or you could sub-query:

from (select * from users where year=2017)
group by service;
ballardw
Super User

A very small data set to test with such as:

data users;
   input user_id Service;
   year=2017;
datalines;
1  1
1  2
;
run;

should fairly easily answer this.

 

Note that the output of the Proc SQL for that example is 2 records with a count of one each. Since there are only two records input ...

LinusH
Tourmaline | Level 20
Sorry @RW9 I'm not sure I'm following you, why can't I filter the source data with a WHERE clause like this? Beside the logical part of it, HAVING will never use indexes for optimization.
I pretty much use HAVING only for filtering on aggregations.
Data never sleeps
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I can't remember why exactly, must have had issues with it at some point in the past so now I totally avoid using where clauses on group by's and put them in subqueries or havings.  It may just be habit from my side.  Definately can't use use where for aggregates.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2217 views
  • 0 likes
  • 4 in conversation