BookmarkSubscribeRSS Feed
KellyJade
Calcite | Level 5

This is a pretty basic question, but I wanted some confirmation since I haven't been able to find a clear answer. 

 

I have a database of users of a certain services we provide in a given year. Sometimes these users return for the same service or other services. I want to find how many distinct users I have of each service so I constructed this code:

 

proc sql;
create table SERVICES as
select
service,
count(distinct user_id) as Num_Users
from users
where year = 2017
group by service;
quit;

 

I want to know if this is counting unique users of each service or unique users from the total database. For example, if person A used Service1 and Service2 are they being counted in both service groupings or are they being counted as 1 unique user? Is this the correct interpretation of this code? 

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You wouldn't want to use a where clause with a group by.  In SQL you would use a having clause with a group by:

group by service having year=2017;

Or you could sub-query:

from (select * from users where year=2017)
group by service;
ballardw
Super User

A very small data set to test with such as:

data users;
   input user_id Service;
   year=2017;
datalines;
1  1
1  2
;
run;

should fairly easily answer this.

 

Note that the output of the Proc SQL for that example is 2 records with a count of one each. Since there are only two records input ...

LinusH
Tourmaline | Level 20
Sorry @RW9 I'm not sure I'm following you, why can't I filter the source data with a WHERE clause like this? Beside the logical part of it, HAVING will never use indexes for optimization.
I pretty much use HAVING only for filtering on aggregations.
Data never sleeps
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I can't remember why exactly, must have had issues with it at some point in the past so now I totally avoid using where clauses on group by's and put them in subqueries or havings.  It may just be habit from my side.  Definately can't use use where for aggregates.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1804 views
  • 0 likes
  • 4 in conversation