PROC SQL question - Count Distinct, Sum, Group by problems

Reply
New Contributor
Posts: 2

PROC SQL question - Count Distinct, Sum, Group by problems

Hi,

 

Hopefully an easy question to solve, but I'm back to using Base SAS after 8 years of using SAS Enterprise Guide and I'm getting a mental block!!

 

Below is my proc sql code

 

proc sql;
create table mktg_count
as
select PROD_DETAIL,
count(distinct DM_CUST_KEY) as Total_Customers,
sum( case when valid_email = 'Y' and SUPRS_EMAIL_FLAG = 'N' and OPT_OUT_FLAG = 'N' then 1 else 0 end) as Emailable,
sum( case when SUPRS_MAIL_FLAG = 'N' and OPT_OUT_FLAG = 'N' then 1 else 0 end) as Mailable
from cust_emails
where PROD_DETAIL not in ('Home','Motor')
group by PROD_DETAIL;
quit;

 

What I'm trying to do is create the Emailable and Mailable columns for the distinct values of Total Customers, but my sum function is bringing back the sum for the whole of my table instead.

 

Can anybody please help me with this so that the sum function is only on distinct customers.  I've tried putting the calculated columns in the group by statement, but that doesn't seem to work.

 

Thanks,

Dan

 

 

Super User
Posts: 7,446

Re: PROC SQL question - Count Distinct, Sum, Group by problems

If you provide some example data to play around with (in a data step), I could try to write some code for this.

Basically I would sum up first on DM_CUST_KEY (I guess that if a DM_CUST_KEY has at least one valid_email = 'Y', you want to count that as 1 for that particular DM_CUST_KEY), and then sum on the PROD_DETAIL in a second step. But to get a feel for your data, I'd need that first.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,720

Re: PROC SQL question - Count Distinct, Sum, Group by problems

Can you supply some test data - in the form of a datastep - and what the output should look like.  It is confusing to me that you have count(distinct DM_CUST_KEY), but want this to be unique keys?  Its either a count of that, or a unique list of but not both.  The where should be moved to a subquery or having clause.  And its likely a simple datastep or proc would do this better anyways.  But without some data its hard to say.

Super User
Posts: 5,388

Re: PROC SQL question - Count Distinct, Sum, Group by problems

Not sure what you mean by unique customer in the sum functions. You have a nested case using three columns. How would you expect that to be evaluated across multiple rows for a single customer?

Perhaps it's easier to do a preparation step to first give you unique rows for a customer, then do the count?

Data never sleeps
New Contributor
Posts: 2

Re: PROC SQL question - Count Distinct, Sum, Group by problems

Hi,

 

I've solved this for myself now.  I had an extra proc sort de-dupe missing originally so I've added that in now and got the results I wanted.  Schoolboy error :-).

 

Thanks for all your help and comments, though.

 

Dan

 

Ask a Question
Discussion stats
  • 4 replies
  • 435 views
  • 0 likes
  • 4 in conversation