DATA Step, Macro, Functions and more

Aggregation and Group by processing

Reply
Frequent Contributor
Posts: 99

Aggregation and Group by processing

 

Hi all,

I’m having trouble figuring this out. Any help is greatly appreciated

 

I have a data set which contains customer sales data by store and online channels.  I need to take this data set and calculate: 

  1. The total sales by channel
  2. The percent of total sales by channel
  3. The total sales by product type (store and online together)
  4. The percent of total sales by product type (store and Online together)

Here is the table I start with: 

 

CHANNEL_UNIT CUSTOMER_ID prod_type prod_sales
1 17131 STORE_HOME 0
1 17131 STORE_CORE_APP_M 526.8
1 17131 STORE_CHILDREN 0
1 17131 STORE_CORE_APP_W 0
3 17131 ONLINE_HOME 0
3 17131 ONLINE_CHILDREN 196.99
3 17131 ONLINE_CORE_APP_M 119.96
3 17131 ONLINE_CORE_APP_W 0
1 4687704 STORE_HOME 67.5
1 4687704 STORE_CORE_APP_M 0
1 4687704 STORE_CHILDREN 1518
1 4687704 STORE_CORE_APP_W 0
3 4687704 ONLINE_HOME 0
3 4687704 ONLINE_CHILDREN 120
3 4687704 ONLINE_CORE_APP_M 223
3 4687704 ONLINE_CORE_APP_W 0

 

This is the output table I'm trying to achieve: 

 

CUSTOMER_ID Total_Sales Store_Sales Online_Sales Percent_Store_Sales Percent_Online_Sales TTL_HOME TTL_CORE_APP_M TTL_CORE_APP_W TTL_STORE_CHILDREN Per_HOME Per_CORE_APP_M Per_CORE_APP_W Per_STORE_CHILDREN
17131 $843.75 $526.80 $316.95 62.4% 60.2% $0.00 $646.76 $0.00 $196.99 0.0% 76.7% 0.0% 23.3%
4687704 $1,928.50 $1,585.50 $343.00 82.2% 21.6% $67.50 $223.00 $0.00 $1,638.00 3.5% 11.6% 0.0% 84.9%
Super User
Posts: 5,256

Re: Aggregation and Group by processing

Take a look at PROC TABULATE or PROC REPORT.

Data never sleeps
Frequent Contributor
Posts: 99

Re: Aggregation and Group by processing

Hi, 

 

I need an actual table, not a report output. 

Super User
Posts: 17,823

Re: Aggregation and Group by processing

Proc tabulate does generate an OUTPUT dataset that can be manipulated into that format.

 

Another option is combining proc freqs or proc sqls.  Because you're combining at different levels, its probably easiest to do multiple tables and combine them for your final results unless you're super confident in your coding skills.

Super User
Posts: 10,500

Re: Aggregation and Group by processing


RobertNYC wrote:

Hi, 

 

I need an actual table, not a report output. 


If you need a data set then the question becomes what will you do next with the result? In a large number of cases making data wide like that increases coding complexity.

Frequent Contributor
Posts: 99

Re: Aggregation and Group by processing

Hi 

 

estructure the data is this way is because I am building a logistic model and I need to have the input variables transformed appropriately. Thanks.     

Super User
Posts: 9,681

Re: Aggregation and Group by processing

It is easy for SQL.



data have;
infile cards truncover expandtabs;
input CHANNEL_UNIT	CUSTOMER_ID	prod_type : $40.	prod_sales;
flag=scan(prod_type,1,'_');
cards;
1	17131	STORE_HOME	0
1	17131	STORE_CORE_APP_M	526.8
1	17131	STORE_CHILDREN	0
1	17131	STORE_CORE_APP_W	0
3	17131	ONLINE_HOME	0
3	17131	ONLINE_CHILDREN	196.99
3	17131	ONLINE_CORE_APP_M	119.96
3	17131	ONLINE_CORE_APP_W	0
1	4687704	STORE_HOME	67.5
1	4687704	STORE_CORE_APP_M	0
1	4687704	STORE_CHILDREN	1518
1	4687704	STORE_CORE_APP_W	0
3	4687704	ONLINE_HOME	0
3	4687704	ONLINE_CHILDREN	120
3	4687704	ONLINE_CORE_APP_M	223
3	4687704	ONLINE_CORE_APP_W	0
;
run;
proc sql;
create table want as
 select customer_id,
  sum(prod_sales) as total_sales,
  (select sum(prod_sales) from have 
   where customer_id=a.customer_id and flag='STORE') as store_sales,
   
  calculated total_sales/(select sum(prod_sales) from have) as percent_total,
  calculated store_sales/(select sum(prod_sales) from have where flag='STORE') 
  as percent_store
 
  from have as a
   group by customer_id;
 quit;





Ask a Question
Discussion stats
  • 6 replies
  • 250 views
  • 2 likes
  • 5 in conversation