Grouping data with BY statement

Frequent Contributor
Posts: 84

DATA TMP;
SET NOV_TRX;
BY PK_CONSUMER_ID;
IF FIRST.PK_CONSUMER_ID THEN TOTAL_VOLUME = 0;
TOTAL_VOLUME + SALES_QUANTITY;
IF LAST.PK_CONSUMER_ID THEN OUTPUT;
RUN;

Hello everyone,

I am trying to group a data set (please see the attachment, which represents a sample of my data). The code I wrote so far gives me TOTAL_VOLUME per consumer. However, I also need the volume per customer and VP_MG, as well as the volume per customer and Diesel_Petrol, but I do not know how to amend the code to make it work. Thanks for any suggestions.


Accepted Solutions
Solution
‎04-20-2017 07:45 AM
Super User
Posts: 7,762

Re: Grouping data with BY statement

Posted in reply to Uknown_user

DO NOT post SAS data in an Excel file. An Excel spreadsheet contains no metadata (variable lengths, assigned formats, etc.), which is crucial for understanding your issue.

Instead, use the macro provided in https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your dataset into a data step that can be posted here in a code window. This lets others recreate your dataset exactly with copy/paste and run, and since only text is involved, firewalls won't block it. (I can't download Excel files from the internet where my SAS is available, because the company firewall blocks them for security reasons; the same is true for many contributors here.)
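
For readers who have not seen data posted this way, a post might look roughly like the sketch below. Everything in it is an assumption for illustration: the values, lengths, and variable types are invented and do not come from the attachment; only the variable names follow the code in this thread.

```sas
/* Hypothetical sample only: values, lengths and types are invented
   for illustration and are NOT taken from the poster's attachment. */
data nov_trx;
  infile datalines dlm=',' truncover;
  input pk_customer_id :$8. vp_mg :$2. diesel_petrol :$6. sales_quantity;
datalines;
C001,MG,Diesel,10
C001,VP,Diesel,40
C001,VP,Petrol,25
C002,MG,Petrol,30
;
run;
```

Because the whole dataset is plain text, anyone can paste it into a SAS session and run the suggested code against it.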

Since you have two different subgroups, you will need two steps if you use data step logic:

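/* one pass over nov_trx, ordered by pk_customer_id and vp_mg, writes both summaries */
data
  sum_total (keep=pk_customer_id total_volume)
  sum_vp_mg (keep=pk_customer_id vp_mg vp_mg_volume)
;
set nov_trx;
by pk_customer_id vp_mg;
/* reset each accumulator at the start of its BY group */
if first.pk_customer_id then total_volume = 0;
if first.vp_mg then vp_mg_volume = 0;
/* sum statements: the accumulators are retained across observations */
total_volume + sales_quantity;
vp_mg_volume + sales_quantity;
/* write one row per group when its last observation is reached */
if last.vp_mg then output sum_vp_mg;
if last.pk_customer_id then output sum_total;
run;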

/* separate step: here nov_trx must be ordered by pk_customer_id and diesel_petrol */
data sum_diesel_petrol (keep=pk_customer_id diesel_petrol dp_volume);
set nov_trx;
by pk_customer_id diesel_petrol;
if first.diesel_petrol then dp_volume = 0;
dp_volume + sales_quantity;
if last.diesel_petrol then output;
run;
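
One point the two steps above rely on: a BY statement expects the input to be ordered by the BY variables, and the two steps use different orders. If nov_trx is not already sorted that way, something like the following sketch could run first. The output dataset names nov_trx_by_vp and nov_trx_by_dp are hypothetical, chosen only for this example.

```sas
/* Assumption: nov_trx is not yet sorted in either of these orders.
   The output dataset names are hypothetical. */
proc sort data=nov_trx out=nov_trx_by_vp;
  by pk_customer_id vp_mg;
run;

proc sort data=nov_trx out=nov_trx_by_dp;
  by pk_customer_id diesel_petrol;
run;
```

The first data step would then read `set nov_trx_by_vp;` and the second `set nov_trx_by_dp;`. Sorting into separate datasets leaves nov_trx untouched; sorting it in place before each step would work as well.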


All Replies
Super User
Posts: 19,770

Re: Grouping data with BY statement

Posted in reply to Uknown_user

Look at PROC MEANS, specifically the CLASS, WAYS, and TYPES statements.
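
As a rough sketch of what that could look like for this question, assuming the variable names used elsewhere in the thread (the output dataset name volume_summary is hypothetical):

```sas
/* Sketch only: variable names follow the thread; volume_summary is a
   hypothetical output dataset name. */
proc means data=nov_trx noprint;
  class pk_customer_id vp_mg diesel_petrol;
  /* request only the three groupings asked for */
  types pk_customer_id
        pk_customer_id*vp_mg
        pk_customer_id*diesel_petrol;
  var sales_quantity;
  output out=volume_summary sum=total_volume;
run;
```

All three summaries land in one dataset, and the automatic _TYPE_ variable tells the groupings apart. With CLASS the input does not need to be sorted. Note that observations with a missing value in any CLASS variable are left out unless the MISSING option is added.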

Frequent Contributor
Posts: 84

Re: Grouping data with BY statement

Thanks for your suggestion, I might take a look at it later. For now I would like to manage it with just the BY statement, which I believe should be quite easy.

Frequent Contributor
Posts: 84

Re: Grouping data with BY statement

Posted in reply to KurtBremser

Thanks Kurt for your recommendation. The code worked fine.
