Hello friends,
I have the following table sorted by Customer_ID and Transaction_Date:
Customer_ID, Transaction_Date, Store_Region, Amount
I would like to calculate running total such that each time a Customer or a Store_Region change, it starts over again.
Tried several solutions but all require sorting by Store_Region - this is not my case.
It is important that the data stays sorted by Customer_ID and Transation_Date.
Thank you
Hello @shparber_m and welcome to the SAS Support Communities!
Have you tried using the LAG function?
data want;
set have;
by Customer_ID Transaction_Date;
if first.Customer_ID or Store_Region ne lag(Store_Region) then rtotal=Amount;
else rtotal+Amount;
run;
Or, alternatively, the NOTSORTED option of the BY statement?
data want;
set have;
by Customer_ID Store_Region notsorted;
if first.Store_Region then rtotal=Amount;
else rtotal+Amount;
run;
Sort by customer_id, store_region and transaction_date into a temporary dataset; calculate the running totals; then sort the result by customer_id and transaction_date.
But I do not want to sort by Store_Region!
Could be a situation where a customer buys in NJ then in NY and the in NJ again.
I need each purchase to reset my running total in this case.
If I order by Store_Region as you propose, then I'll have NJ, NJ, NY and it will add two NJ into the second running total.
I dont want that. I want to order only by Customer_ID and Transaction_Date.
Please help.
Thank you!
It may help to provide an example of what you have with several different customer_id, transaction_date, store_region and amount. Then show what you expect the result to be for that small example (small enough to do by hand).
Sure! Here is a sample:
Each time that either customer_id or store_region change - I need to reset the running total.
The sort has to remain by Customer_ID and Transaction_Date only.
Please help.
Thank you!
Customer_ID | Transaction_Date | Store_Region | Amount | Running_Total |
A | 2021-07-01 | NJ | 100 | 100 |
A | 2021-07-02 | NJ | 100 | 200 |
A | 2021-07-03 | TX | 100 | 100 |
A | 2021-07-04 | NJ | 100 | 100 |
A | 2021-07-05 | TX | 100 | 100 |
A | 2021-07-06 | TX | 100 | 200 |
B | 2021-07-01 | TX | 10 | 10 |
B | 2021-07-02 | TX | 10 | 20 |
B | 2021-07-03 | NJ | 10 | 10 |
B | 2021-07-04 | TX | 10 | 10 |
B | 2021-07-05 | NJ | 10 | 10 |
B | 2021-07-06 | NJ | 10 | 20 |
Hello @shparber_m and welcome to the SAS Support Communities!
Have you tried using the LAG function?
data want;
set have;
by Customer_ID Transaction_Date;
if first.Customer_ID or Store_Region ne lag(Store_Region) then rtotal=Amount;
else rtotal+Amount;
run;
Or, alternatively, the NOTSORTED option of the BY statement?
data want;
set have;
by Customer_ID Store_Region notsorted;
if first.Store_Region then rtotal=Amount;
else rtotal+Amount;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.