Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Cumulative Amount based on sequential dates

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

☑ This topic is **solved**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 01-24-2022 10:03 AM
(655 views)

Hello there,

I would like to calculate the cumulative amount based on sequential dates.

Please see below the data I have:

```
data have;
input ID $ DT:date9. Amount;
format DT date9.;
datalines;
A 09JUL2021 3600
A 03AUG2021 456
A 04AUG2021 33
A 06AUG2021 235
A 07AUG2021 100
A 09AUG2021 86
A 12AUG2021 456
A 24AUG2021 22
A 25AUG2021 987
A 26AUG2021 916
A 27AUG2021 81
;
run;
```

I want a variable that represents the cumulative amount of the sequential dates. E.g., `03AUG2021 & 04AUG2021`

are sequential and therefore the cumulative amount is 489.

Please see below the data I want:

```
data want;
input ID $ DT:date9. Amount Variable_Wanted;
format DT date9.;
datalines;
A 09JUL2021 3600 0
A 03AUG2021 456 0
A 04AUG2021 33 489
A 06AUG2021 235 0
A 07AUG2021 100 335
A 09AUG2021 86 0
A 12AUG2021 456 0
A 24AUG2021 22 0
A 25AUG2021 987 1009
A 26AUG2021 916 1903
A 27AUG2021 81 997
;
run;
```

Ideally, the user should be able to select the number of sequential dates to include.

Please see below the closest I can make:

```
data wanted;
set have;
new_Amount = 0;
by ID DT;
if dif(DT) <= 1 then /*how many sequential*/
do;
new_Amount = Amount;
end;
run;
```

Any help would much be appreciated.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

how about this code.

```
proc sort data=have;
by id dt;
run;
data want;
set have;
by id dt;
retain sum amt dt0;
if dt=sum(dt0,1) then sum=sum(amt,amount);
else sum=0;
amt=amount;
dt0=dt;
drop amt dt0;
run;
```

16 REPLIES 16

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

how about this code.

```
proc sort data=have;
by id dt;
run;
data want;
set have;
by id dt;
retain sum amt dt0;
if dt=sum(dt0,1) then sum=sum(amt,amount);
else sum=0;
amt=amount;
dt0=dt;
drop amt dt0;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi,
I didnt consider everyting and I am wondering if I could ask some additinal help.
As is, the program calculates the cumulative amount only for 1 sequential dates. For instance, 03AUG2021 & 04AUG2021 which is 489.
I also wanted to be able to calculate the cumulative amount for more than 1 sequential dates. For example:
- for two sequential dates: 04AUG2021 & 06AUG2021 = 268
- for three sequential dates: 25AUG2021 & 26AUG2021 & 27AUG2021 = 3909.
I tried to change the dt=sum(dt0,2), but it doesnt work because it will not consider any dates between, meaning it will calculate the cumulative amount for only the pairs: 04AUG2021,06AUG2021 and 07AUG2021,09AUG2021.
Any help please?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Your initial question (summing only two days) can be solved like this:

```
data want;
set have;
by id dt;
variable_wanted = ifn(
first.id or lag(dt) ne dt - 1,
0,
amount + lag(amount)
);
run;
```

A running total over any arbitrary number of consecutive days goes like this:

```
data want;
set have;
by id dt;
if first.id or lag(dt) ne dt - 1
then do;
variable_wanted = 0;
output;
variable_wanted = amount;
end;
else do;
variable_wanted + amount;
output;
end;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi, thank you so much for your reply.
It is not exaclty what I wanted.
How can I get the sum of the next, lets say, 2 days of each day?
This is what I was looking for (when I want to find the sum of 2 days in a row):
data what_I_want;
input ID $ DT:date9. Amount Sum_of_Next_2_Days;
format DT date9.;
datalines;
A 09JUL2021 3600 0
A 03AUG2021 456 0
A 04AUG2021 33 489
A 06AUG2021 235 268
A 07AUG2021 100 335
A 09AUG2021 86 186
A 12AUG2021 456 0
A 24AUG2021 22 0
A 25AUG2021 987 0
A 26AUG2021 916 1925
A 27AUG2021 81 1984
;
run;
It should work similarly if I want to find the sum of 3 days in a row and so on.
Thanks.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Is this to be based on the same example data you have in your initial post?

And please don't forget to use code boxes for your code, otherwise it gets scrambled, as you can see 😉

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello,

Below my data:

```
data what_I_want;
input ID $ DT:date9. Amount Sum_of_Next_2_Days;
format DT date9.;
datalines;
A 09JUL2021 3600 3600
A 03AUG2021 456 489
A 04AUG2021 33 268
A 06AUG2021 235 335
A 07AUG2021 100 186
A 09AUG2021 86 86
A 12AUG2021 456 456
A 24AUG2021 22 1925
A 25AUG2021 987 1984
A 26AUG2021 916 1551
A 27AUG2021 81 828
B 07AUG2021 554 992
B 08AUG2021 193 438
B 09AUG2021 245 245
;
run;
```

Along the line I would like something like that.

For each date, I want to find the sum of its next 2 days. Find for next 2 dates: For 09Jul, sum Amount where DT Between 09Jul and 11Jul For 03Aug, sum Amount where DT Between 03Aug and 05Aug For 04Aug, sum Amount where DT Between 04Aug and 06Aug For 06Aug, sum Amount where DT Between 06Aug and 08Aug For 07Aug, sum Amount where DT Between 07Aug and 09Aug For 09Aug, sum Amount where DT Between 09Aug and 11Aug For 12Aug, sum Amount where DT Between 12Aug and 14Aug For 24Aug, sum Amount where DT Between 24Aug and 26Aug For 25Aug, sum Amount where DT Between 25Aug and 27Aug For 26Aug, sum Amount where DT Between 26Aug and 28Aug For 27Aug, sum Amount where DT Between 27Aug and 29Aug For 07Aug, sum Amount where DT Between 07Aug and 09Aug For 08Aug, sum Amount where DT Between 08Aug and 10Aug For 09Aug, sum Amount where DT Between 09Aug and 11Aug

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Simple solution in SQL:

```
proc sql;
create table want as
select
t1.id, t1.dt, t1.amount,
(
select sum(amount)
from what_i_want t2
where t1.id = t2.id and t1.dt le t2.dt le t1.dt + 2
) as sum_of_next_2_days
from what_i_want t1
;
quit;
```

If this turns out to be non-feasible because of bad performance (in case of a large input dataset), we would need to think of a data step solution that uses a date-based array. This data step would then read the data in one sequential pass.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks for the code.

It works but as you mentioned it is not efficient with big data.

I can split the dataset and run it in smaller data.

But wondered if there is any other way.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

See the array-based approach here:

```
data have;
input ID $ DT:date9. Amount;
format DT date9.;
datalines;
A 09JUL2021 3600
A 03AUG2021 456
A 04AUG2021 33
A 06AUG2021 235
A 07AUG2021 100
A 09AUG2021 86
A 12AUG2021 456
A 24AUG2021 22
A 25AUG2021 987
A 26AUG2021 916
A 27AUG2021 81
B 07AUG2021 554
B 08AUG2021 193
B 09AUG2021 245
;
%let start = %sysfunc(inputn(19000101,yymmdd8.));
%let end = %sysfunc(inputn(20991231,yymmdd8.));
data want;
array __amount {&start.:&end.};
do until (last.id);
set have;
by id dt;
if first.id then __start = dt;
__amount{dt} = amount;
end;
do dt = __start to dt;
if __amount{dt} ne .
then do;
amount = __amount{dt};
sum_of_next_2_days = sum(__amount{dt},__amount{dt+1},__amount{dt+2});
output;
end;
end;
drop __:;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you it works and it is really fast.

I guess I would have to learn more about arrays.

Thanks again.

I guess I would have to learn more about arrays.

Thanks again.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

While it may not be obvious at first, an array indexed by dates is not really that large (if you set reasonable start and end dates) and can easily fit into available memory.

Even if you span over all possible dates, a numeric array would be about 24 MB in size:

```
data _null_;
x = 8 * ('31dec9999'd - '01jan1582'd);
put x=;
run;
```

(make it temporary so that no individual variable names need to be kept, which would massively increase the memory consumption)

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi, hope you are well.

It is a quite old post, but I wondered if you could help on the below.

How can this approach be modified in order to consider only consecutive days, i.e., days happen one after each other without breaks.

For example, if we choose to consider only 3 consecutive days, the data will be as below:

```
data have;
input ID $ DT:date9. Amount Concecutive_Amount;
format DT date9.;
datalines;
A 09JUL2021 3600 .
A 03AUG2021 456 489
A 04AUG2021 33 .
A 06AUG2021 235 335
A 07AUG2021 100 .
A 09AUG2021 86 .
A 12AUG2021 456 .
A 24AUG2021 22 1925
A 25AUG2021 987 1984
A 26AUG2021 916 997
A 27AUG2021 81 .
B 07AUG2021 554 992
B 08AUG2021 193 438
B 09AUG2021 245
;
```

Any help would be much appreciated.

(in case it's better to create a new topic please let me know)

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Do the summation only when the next day is non-missing.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

May I ask to help me on this? That was my initial thought too but I cannot make it..

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.