The following is an example of the dataset 'amounts' I have:
startdate | expirydate | amount |
1/1/2010 | 2/2/2020 | 10 |
1/5/2011 | 2/3/2015 | 10 |
1/3/2012 | 2/2/2023 | 10 |
1/4/2013 | 2/2/2014 | 10 |
5/5/2015 | 2/2/2028 | 10 |
1/6/2016 | 2/2/2032 | 10 |
I want to calculate the sum of all existing amounts as of each start date so it should look like this:
startdate | amount |
1/1/2010 | 10 |
1/5/2011 | 20 |
1/3/2012 | 30 |
1/4/2013 | 40 |
5/5/2015 | 30 |
1/6/2016 | 40 |
How do I do this in SAS?
In the output data set, please explain how the AMOUNT is 40 on the fourth line and 30 on the fifth line.
The input of 1/5/2015 in the first table was actually an error which I've changed to 5/5/2015. Essentially what I want to do is for each of the start dates, calculate the cumulative sum of any amounts that haven't expired. So for the first four dates, it is just a running cumulative sum because none of the amounts have expired. But at 5/5/2015, two of the previous amounts have expired hence a cumulative sum of 30. Same for the last date, where the same two have previously expired and you have the additional amount as of 1/6/2016 therefore 40.
Here's a data step solution.
data want;
set have;
active_cum_sum = 0;
do _i = 1 to _n_;
set have (keep=expirydate amount rename=(expirydate=_expdt2 amount=_amt2)) point=_i;
if startdate < _expdt2 then active_cum_sum + _amt2;
end;
drop _:;
run;
Hi @analyst97725 A perfect use case for a student to learn SQL self joins -
data have ;
input (startdate expirydate) (:mmddyy10.) amount ;
format startdate expirydate date9.;
cards ;
1/1/2010 2/2/2020 10
1/5/2011 2/3/2015 10
1/3/2012 2/2/2023 10
1/4/2013 2/2/2014 10
5/5/2015 2/2/2028 10
1/6/2016 2/2/2032 10
;
proc sql ;
create table want as
select a.startdate, a.expirydate ,
sum((b.expirydate > a.startdate) * b.amount) as amount
from have a
left join have b
on a.startdate >= b.startdate
group by a.startdate, a.expirydate ;
quit ;
proc print noobs ;run ;
startdate | expirydate | amount |
---|---|---|
01JAN2010 | 02FEB2020 | 10 |
05JAN2011 | 03FEB2015 | 20 |
03JAN2012 | 02FEB2023 | 30 |
04JAN2013 | 02FEB2014 | 40 |
05MAY2015 | 02FEB2028 | 30 |
06JAN2016 | 02FEB2032 | 40 |
I usually try to avoid sorting datasets to reduce disk activity, but this might be a worthy exception:
data have;
input (startdate expirydate) (:mmddyy10.) amount ;
format startdate expirydate date9.;
cards ;
1/1/2010 2/2/2020 10
1/5/2011 2/3/2015 10
1/3/2012 2/2/2023 10
1/4/2013 2/2/2014 10
5/5/2015 2/2/2028 10
1/6/2016 2/2/2032 10
run;
proc sort data=have (keep=expirydate amount) out=expiration;
by expirydate;
run;
data want;
set expiration (rename=(expirydate=startdate) in=inexp) have (in=inhave);
by startdate;
cum_amount + ifn(inexp,-1*amount,amount);
if inhave;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.