DATA TMP;
  SET NOV_TRX;
  BY PK_CONSUMER_ID;
  IF FIRST.PK_CONSUMER_ID THEN TOTAL_VOLUME = 0;
  TOTAL_VOLUME + SALES_QUANTITY;
  IF LAST.PK_CONSUMER_ID THEN OUTPUT;
RUN;
Hello everyone,
I am trying to group a data set (please see the attachment, which contains a sample of my data). The code I have written so far gives me TOTAL_VOLUME per consumer. However, I also need the volume per customer and VP_MG, as well as the volume per customer and Diesel_Petrol, and I do not know how to amend the code to get them. Thanks for any suggestions.
DO NOT post SAS data in an Excel file. An Excel spreadsheet contains no metadata (variable lengths, assigned formats, etc.), which is crucial for understanding your issue.
Instead, use the macro provided in https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to convert your dataset into a data step that can be posted here in a code window. This lets others recreate your dataset exactly with copy/paste and run, and since only text is involved, firewalls won't block it. (I can't download Excel files from the internet in the environment where my SAS is available, because the company firewall blocks them for security reasons; the same is true for many contributors here.)
Since you have two different subgroups, you will need two steps if you use data step logic:
data
  sum_total (keep=pk_customer_id total_volume)
  sum_vp_mg (keep=pk_customer_id vp_mg vp_mg_volume)
;
  set nov_trx;
  by pk_customer_id vp_mg;
  /* reset each accumulator at the start of its group */
  if first.pk_customer_id then total_volume = 0;
  if first.vp_mg then vp_mg_volume = 0;
  /* sum statements: the totals are retained across observations */
  total_volume + sales_quantity;
  vp_mg_volume + sales_quantity;
  /* write one row per group when its last observation is reached */
  if last.vp_mg then output sum_vp_mg;
  if last.pk_customer_id then output sum_total;
run;

data sum_diesel_petrol (keep=pk_customer_id diesel_petrol dp_volume);
  set nov_trx;
  by pk_customer_id diesel_petrol;
  if first.diesel_petrol then dp_volume = 0;
  dp_volume + sales_quantity;
  if last.diesel_petrol then output;
run;
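One thing to keep in mind: BY-group processing expects the input to be sorted (or indexed) by the BY variables, and the two steps use different BY lists. A minimal sketch, assuming nov_trx is not already in the required order; the output names trx_by_vp_mg and trx_by_dp are made up for illustration:

```sas
/* Sorted copy for the first step (total per customer and per VP_MG). */
proc sort data=nov_trx out=trx_by_vp_mg;
  by pk_customer_id vp_mg;
run;

/* Sorted copy for the second step (per DIESEL_PETROL). */
proc sort data=nov_trx out=trx_by_dp;
  by pk_customer_id diesel_petrol;
run;
```

Each data step would then read its own sorted copy in the SET statement instead of nov_trx. If the data are not in BY order, the step normally stops with an error saying the BY variables are not properly sorted.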
Look at PROC MEANS, specifically look at the CLASS, WAYS, TYPES statements.
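As a rough sketch of what that could look like, assuming the variable names used elsewhere in this thread (pk_customer_id, vp_mg, diesel_petrol, sales_quantity); the output dataset name volumes is hypothetical:

```sas
proc means data=nov_trx noprint;
  /* all grouping variables go into CLASS */
  class pk_customer_id vp_mg diesel_petrol;
  /* request only the three groupings that are needed */
  types pk_customer_id
        pk_customer_id*vp_mg
        pk_customer_id*diesel_petrol;
  var sales_quantity;
  /* one output dataset; SUM= names the summed variable */
  output out=volumes sum=total_volume;
run;
```

With CLASS, the input does not have to be sorted. The automatic _TYPE_ variable in the output dataset tells the three groupings apart, so they can be split into separate datasets afterwards if needed. Note that observations with a missing value in any CLASS variable are excluded by default; the MISSING option on the PROC MEANS statement keeps them.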
Thanks for your suggestion, I might take a look at it later. For now I would like to manage it with just a BY statement, which I believe should be straightforward.
Thanks Kurt for your recommendation. The code worked fine.