Hello friends,
I have the following table sorted by Customer_ID and Transaction_Date:
Customer_ID, Transaction_Date, Store_Region, Amount
I would like to calculate running total such that each time a Customer or a Store_Region change, it starts over again.
Tried several solutions but all require sorting by Store_Region - this is not my case.
It is important that the data stays sorted by Customer_ID and Transation_Date.
Thank you
Hello @shparber_m and welcome to the SAS Support Communities!
Have you tried using the LAG function?
data want;
set have;
by Customer_ID Transaction_Date;
if first.Customer_ID or Store_Region ne lag(Store_Region) then rtotal=Amount;
else rtotal+Amount;
run;
Or, alternatively, the NOTSORTED option of the BY statement?
data want;
set have;
by Customer_ID Store_Region notsorted;
if first.Store_Region then rtotal=Amount;
else rtotal+Amount;
run;
Sort by customer_id, store_region and transaction_date into a temporary dataset; calculate the running totals; then sort the result by customer_id and transaction_date.
But I do not want to sort by Store_Region!
Could be a situation where a customer buys in NJ then in NY and the in NJ again.
I need each purchase to reset my running total in this case.
If I order by Store_Region as you propose, then I'll have NJ, NJ, NY and it will add two NJ into the second running total.
I dont want that. I want to order only by Customer_ID and Transaction_Date.
Please help.
Thank you!
It may help to provide an example of what you have with several different customer_id, transaction_date, store_region and amount. Then show what you expect the result to be for that small example (small enough to do by hand).
Sure! Here is a sample:
Each time that either customer_id or store_region change - I need to reset the running total.
The sort has to remain by Customer_ID and Transaction_Date only.
Please help.
Thank you!
Customer_ID | Transaction_Date | Store_Region | Amount | Running_Total |
A | 2021-07-01 | NJ | 100 | 100 |
A | 2021-07-02 | NJ | 100 | 200 |
A | 2021-07-03 | TX | 100 | 100 |
A | 2021-07-04 | NJ | 100 | 100 |
A | 2021-07-05 | TX | 100 | 100 |
A | 2021-07-06 | TX | 100 | 200 |
B | 2021-07-01 | TX | 10 | 10 |
B | 2021-07-02 | TX | 10 | 20 |
B | 2021-07-03 | NJ | 10 | 10 |
B | 2021-07-04 | TX | 10 | 10 |
B | 2021-07-05 | NJ | 10 | 10 |
B | 2021-07-06 | NJ | 10 | 20 |
Hello @shparber_m and welcome to the SAS Support Communities!
Have you tried using the LAG function?
data want;
set have;
by Customer_ID Transaction_Date;
if first.Customer_ID or Store_Region ne lag(Store_Region) then rtotal=Amount;
else rtotal+Amount;
run;
Or, alternatively, the NOTSORTED option of the BY statement?
data want;
set have;
by Customer_ID Store_Region notsorted;
if first.Store_Region then rtotal=Amount;
else rtotal+Amount;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.