Calcite | Level 5

## SAS - Calculating cumulative sum with conditions on date range

The following is an example of the dataset 'amounts' I have:

 startdate expirydate amount 1/1/2010 2/2/2020 10 1/5/2011 2/3/2015 10 1/3/2012 2/2/2023 10 1/4/2013 2/2/2014 10 5/5/2015 2/2/2028 10 1/6/2016 2/2/2032 10

I want to calculate the sum of all existing amounts as of each start date so it should look like this:

 startdate amount 1/1/2010 10 1/5/2011 20 1/3/2012 30 1/4/2013 40 5/5/2015 30 1/6/2016 40

How do I do this in SAS?

5 REPLIES 5
Diamond | Level 26

## Re: SAS - Calculating cumulative sum with conditions on date range

In the output data set, please explain how the AMOUNT is 40 on the fourth line and 30 on the fifth line.

--
Paige Miller
Calcite | Level 5

## Re: SAS - Calculating cumulative sum with conditions on date range

The input of 1/5/2015  in the first table was actually an error which I've changed to 5/5/2015. Essentially what I want to do is for each of the start dates, calculate the cumulative sum of any amounts that haven't expired. So for the first four dates, it is just a running cumulative sum because none of the amounts have expired. But at 5/5/2015, two of the previous amounts have expired hence a cumulative sum of 30. Same for the last date, where the same two have previously expired and you have the additional amount as of 1/6/2016 therefore 40.

Obsidian | Level 7

## Re: SAS - Calculating cumulative sum with conditions on date range

Here's a data step solution.

``````data want;
set have;
active_cum_sum = 0;

do _i = 1 to _n_;
set have (keep=expirydate amount rename=(expirydate=_expdt2 amount=_amt2)) point=_i;
if startdate < _expdt2 then active_cum_sum + _amt2;
end;

drop _:;
run;
``````
Tourmaline | Level 20

## Re: SAS - Calculating cumulative sum with conditions on date range

Hi @analyst97725  A perfect use case for a student to learn SQL self joins -

``````
data have ;
input (startdate	expirydate) (:mmddyy10.) amount ;
format startdate	expirydate date9.;
cards ;
1/1/2010	2/2/2020	10
1/5/2011	2/3/2015	10
1/3/2012	2/2/2023	10
1/4/2013	2/2/2014	10
5/5/2015	2/2/2028	10
1/6/2016	2/2/2032	10
;

proc sql ;
create table want as
select a.startdate, a.expirydate ,
sum((b.expirydate > a.startdate) * b.amount) as amount
from have a
left join have b
on a.startdate >= b.startdate
group by a.startdate, a.expirydate ;
quit ;

proc print noobs ;run ;
``````
startdate expirydate amount
01JAN2010 02FEB2020 10
05JAN2011 03FEB2015 20
03JAN2012 02FEB2023 30
04JAN2013 02FEB2014 40
05MAY2015 02FEB2028 30
06JAN2016 02FEB2032 40
PROC Star

## Re: SAS - Calculating cumulative sum with conditions on date range

I usually try to avoid sorting datasets to reduce disk activity, but this might be a worthy exception:

``````data have;
input (startdate	expirydate) (:mmddyy10.) amount ;
format startdate	expirydate date9.;
cards ;
1/1/2010	2/2/2020	10
1/5/2011	2/3/2015	10
1/3/2012	2/2/2023	10
1/4/2013	2/2/2014	10
5/5/2015	2/2/2028	10
1/6/2016	2/2/2032	10
run;

proc sort data=have (keep=expirydate amount) out=expiration;
by expirydate;
run;

data want;
set expiration (rename=(expirydate=startdate) in=inexp) have (in=inhave);
by startdate;
cum_amount + ifn(inexp,-1*amount,amount);
if inhave;
run;``````
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Discussion stats
• 5 replies
• 1461 views
• 2 likes
• 5 in conversation