Hi,
I have a sample dataset follows:
********************************************
data sample;
infile datalines delimiter=',';
input Email $ Product_Type Quantity;
datalines;
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,3,5
E2,1,1
E2,1,1
E2,1,1
E2,1,1
E2,2,3
;
run;
********************************************
I divided users into different groups based of number of orders they made:
********************************************
proc format;
value Orders
1 = '1 order'
2 = '2 orders'
3 - high = '3+ orders'
. = 'No purchase';
run;
********************************************
Then I used proc sql to first group data by email and product type, then put them in a subquery to group it again based on product_type and formatted range.
********************************************
proc sql;
select product_type, put(number_of_orders, Orders.) as number_of_orders, count(*) as number_of_email_accounts
from (select email, product_type, count(*) as number_of_orders
from sample
group by email, product_type) a
group by number_of_orders, a.product_type;
quit;
********************************************
Both E1 and E2 have more than 3 orders for Product_Type 1. Output of subquery:
Product_Type | number_of_orders | |
E1 | 1 | 5 |
E1 | 3 | 1 |
E2 | 1 | 4 |
E2 | 2 | 1 |
Final Output:
********************************************
Product_Type | number_of_orders | number_of_email_accounts |
1 | 3+ orders | 1 |
1 | 3+ orders | 1 |
2 | 1 order | 1 |
3 | 1 order | 1 |
********************************************
It does give the correct group name, but why the group by clause is not working? Can I get a result like
Product_Type | number_of_orders | number_of_email_accounts |
1 | 3+ orders | 2 |
2 | 1 order | 1 |
3 | 1 order | 1 |
Hi Thank you @qqian Much better. Here you go, the following should meet your need
data sample;
infile datalines delimiter=',';
input Email $ Product_Type Quantity;
datalines;
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,3,5
E2,1,1
E2,1,1
E2,1,1
E2,1,1
E2,2,3
;
run;
proc format;
value Orders
1 = '1 order'
2 = '2 orders'
3 - high = 'More than 3 orders'
. = 'No purchase';
run;
proc sql;
create table want as
select product_type,number_of_orders,count(*) as number_of_email_accounts
from
(select email, product_type, put(count(*), Orders.) as number_of_orders
from sample
group by email, product_type)
group by number_of_orders,product_type
order by product_type;
quit;
EDITED: To include formatted value
The data set you show is NOT the dataset used in the sub-query
(select email, product_type, count(*) as number_of_orders
from db1
group by email, product_type)
since you don't show a variable named email. Thus, we can't replicate the result you are getting or explain why it is happening.
So please, give us (a portion of) the actual data used by this code in the data set DB1, and not some "close facsimile" of the data set. Please use these instructions to provide the data set as a SAS data step. DO NOT USE ANY OTHER METHOD TO PROVIDE THE DATA.
Also, I don't see any reason for you to turn the variable QUANTITY to character, unnecessary effort to turn a numeric variable into character here.
Hi Paige,
Sorry for the confusing information. I modified my message and provided the dataset I created, the code and also the output.
Thanks,
Quan
Hi @qqian Breaking your SQL into steps makes me wonder how do you expect what you are looking for
data have;
input Email $ Product_Type Quantity;
cards;
E1 1 2
E2 1 1
E2 2 3
E3 4 24
E4 3 5
E5 2 6
E5 2 8
E5 3 2
;
proc format;
value Orders
1 = '1 order'
2 = '2 orders'
3 - high = 'More than 3 orders'
. = 'No purchase';
run;
proc sql;
create table temp as
select email, product_type, count(*) as number_of_orders
from have
group by email, product_type;
quit;
proc sql;
create table w as
select *, put(number_of_orders, Orders.) as number_of_orders1
from temp
group by number_of_orders1, product_type;
quit;
Please take a look at Temp and W, and clarify
Hi @novinosrin,
Thanks for your response!! I modified my message above and provided the output I got and the ideal output I want to have.
Thanks,
Quan
Hi Thank you @qqian Much better. Here you go, the following should meet your need
data sample;
infile datalines delimiter=',';
input Email $ Product_Type Quantity;
datalines;
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,1,2
E1,3,5
E2,1,1
E2,1,1
E2,1,1
E2,1,1
E2,2,3
;
run;
proc format;
value Orders
1 = '1 order'
2 = '2 orders'
3 - high = 'More than 3 orders'
. = 'No purchase';
run;
proc sql;
create table want as
select product_type,number_of_orders,count(*) as number_of_email_accounts
from
(select email, product_type, put(count(*), Orders.) as number_of_orders
from sample
group by email, product_type)
group by number_of_orders,product_type
order by product_type;
quit;
EDITED: To include formatted value
Hi @novinosrin,
I'd like to add number_of_orders in the want table, then it works! I don't know it should be solved by creating a new table.
proc sql;
create table want as
select product_type, number_of_orders, count(*) as number_of_email_accounts
from
(select email, product_type, put(count(*), Orders.) as number_of_orders
from sample
group by email, product_type)
group by number_of_orders,product_type;
quit;
Thank you so much!!
@qqian No need to create a new table WANT. That has become a habit for us in the coommunity to use HAVE and WANT everywhere lol
Feel free to remove the create table statement as i tested below
proc sql;
select product_type,number_of_orders,count(*) as number_of_email_accounts
from
(select email, product_type, put(count(*), Orders.) as number_of_orders
from sample
group by email, product_type)
group by number_of_orders,product_type
order by product_type;
quit;
Product_Type | number_of_orders | number_of_email_accounts |
---|---|---|
1 | More than 3 orders | 2 |
2 | 1 order | 1 |
3 | 1 order | 1 |
gotcha, so the key is to move 'put' in the subquery. Learned a lot.!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.