Hi SAS Community,
I had a question for you- I tried to look through to forums to see if something similar has been answered before but couldn't find anything- I apologize if it has! I'm completely new to SAS and have just been figuring things out as I go along and have run into a bit of a road-bump. I've posted the details below.
So I have a dataset that looks like the following:
Customer ID | Year | Supplier |
---|---|---|
12014 | 2001 | 88888 |
12014 | 2001 | 88888 |
22222 | 2002 | 88888 |
11111 | 2001 | 99999 |
41243 | 2003 | 99999 |
11111 | 2004 | 99999 |
41531 | 2004 | 88888 |
What I want to do is sum the number of unique customers that each supplier has by year, to get a database that looks like the following:
Supplier | Year | # of Unique Customers |
---|---|---|
88888 | 2001 | 1 |
88888 | 2002 | 1 |
88888 | 2003 | 0 |
88888 | 2004 | 1 |
99999 | 2001 | 1 |
99999 | 2002 | 0 |
99999 | 2003 | 1 |
99999 | 2004 | 1 |
If a customer comes one year and again the next, they would count as unique. If they make multiple purchases in one year, they are not considered unique. I have really no idea how to get from the data I have to the data I want. Any advice would be much appreciated! Thank you.
You don't say if you want a dataset or output. This will create output, prefix with a "create table <name> as" before the first select. "Have" represents the name of your current dataset.
proc sql;
select supplier, year, count(*) as customers
from (select distinct supplier,year, customerid from have)
group by supplier, year;
quit;
You don't say if you want a dataset or output. This will create output, prefix with a "create table <name> as" before the first select. "Have" represents the name of your current dataset.
proc sql;
select supplier, year, count(*) as customers
from (select distinct supplier,year, customerid from have)
group by supplier, year;
quit;
data have;
input CustomerID Year Supplier;
datalines;
12014 2001 88888
12014 2001 88888
22222 2002 88888
11111 2001 99999
41243 2003 99999
11111 2004 99999
41531 2004 88888
;
data base;
do supplier=88888 to 99999 by 11111;
do year=2001 to 2004;
output;
end;
end;
run;
proc sql;
create table want as
select a.supplier,a.year,count(distinct b.customerid) as unique_customer from base a
left join have b
on a.supplier=b.supplier
and a.year=b.year
group by a.supplier,a.year;
quit;
The only problem for solving this with SQL is getting zeros for suppliers/years without data. It can be done this way:
proc sql;
create table want as
select s.supplier, y.year, coalesce(c.nbCust, 0) as nbCust
from
(select distinct supplier from have) as s cross join
(select distinct year from have) as y left join
(select supplier, year, count(distinct customerid) as nbCust
from have
group by supplier, year) as c
on s.supplier=c.supplier and y.year=c.year;
select * from want;
quit;
PG
Ok, the more, the merrier. Freshly learned from ,
data have;
input Customer_ID Year Supplier;
cards;
12014 2001 88888
12014 2001 88888
22222 2002 88888
11111 2001 99999
41243 2003 99999
11111 2004 99999
41531 2004 88888
;
proc sql;
create table want as
select a.supplier, a.year, count(distinct b.customer_id) as ct from
(select distinct * from have (keep=supplier) , have (keep=year) ) a
natural left join have b
group by a.supplier,a.year
order by a.supplier,a.year
;
quit;
Haikuo
Good Point, PG!
Update: if anyone is interested, PG is saying the above statement is better off replace by the following:
Select * from (select distinct supplier from have), (select distinct year from have)
I've always been a fan of the double proc freq, just because its easy to understand
proc freq data=have noprint;
table supplier*year*customer_id/out=temp;
run;
proc freq data=temp noprint;
table supplier*year/out=want sparse;
run;
Nice!
Thanks for sharing, Reeza!
Thank you all so much- all your input has been incredibly helpful and I greatly appreciate it!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.