BookmarkSubscribeRSS Feed
analyst97725
Calcite | Level 5

The following is an example of the dataset 'amounts' I have:


startdateexpirydateamount
1/1/20102/2/202010
1/5/20112/3/201510
1/3/20122/2/202310
1/4/20132/2/201410
5/5/20152/2/202810
1/6/20162/2/203210

I want to calculate the sum of all existing amounts as of each start date so it should look like this:


startdateamount
1/1/201010
1/5/201120
1/3/201230
1/4/201340
5/5/201530
1/6/201640

 

How do I do this in SAS?

5 REPLIES 5
PaigeMiller
Diamond | Level 26

In the output data set, please explain how the AMOUNT is 40 on the fourth line and 30 on the fifth line.

--
Paige Miller
analyst97725
Calcite | Level 5

The input of 1/5/2015  in the first table was actually an error which I've changed to 5/5/2015. Essentially what I want to do is for each of the start dates, calculate the cumulative sum of any amounts that haven't expired. So for the first four dates, it is just a running cumulative sum because none of the amounts have expired. But at 5/5/2015, two of the previous amounts have expired hence a cumulative sum of 30. Same for the last date, where the same two have previously expired and you have the additional amount as of 1/6/2016 therefore 40.

average_joe
Obsidian | Level 7

Here's a data step solution.

 

data want;
set have;
active_cum_sum = 0;

do _i = 1 to _n_;
    set have (keep=expirydate amount rename=(expirydate=_expdt2 amount=_amt2)) point=_i;
    if startdate < _expdt2 then active_cum_sum + _amt2;
end;

drop _:;
run;
novinosrin
Tourmaline | Level 20

Hi @analyst97725  A perfect use case for a student to learn SQL self joins -


data have ;
  input (startdate	expirydate) (:mmddyy10.) amount ;
  format startdate	expirydate date9.;
cards ;
1/1/2010	2/2/2020	10
1/5/2011	2/3/2015	10
1/3/2012	2/2/2023	10
1/4/2013	2/2/2014	10
5/5/2015	2/2/2028	10
1/6/2016	2/2/2032	10
;

proc sql ;
  create table want as
  select a.startdate, a.expirydate ,
         sum((b.expirydate > a.startdate) * b.amount) as amount
  from have a
  left join have b
  on a.startdate >= b.startdate 
  group by a.startdate, a.expirydate ;
quit ;

proc print noobs ;run ;
startdate expirydate amount
01JAN2010 02FEB2020 10
05JAN2011 03FEB2015 20
03JAN2012 02FEB2023 30
04JAN2013 02FEB2014 40
05MAY2015 02FEB2028 30
06JAN2016 02FEB2032 40
mkeintz
PROC Star

I usually try to avoid sorting datasets to reduce disk activity, but this might be a worthy exception:

 

data have;
  input (startdate	expirydate) (:mmddyy10.) amount ;
  format startdate	expirydate date9.;
cards ;
1/1/2010	2/2/2020	10
1/5/2011	2/3/2015	10
1/3/2012	2/2/2023	10
1/4/2013	2/2/2014	10
5/5/2015	2/2/2028	10
1/6/2016	2/2/2032	10
run;

proc sort data=have (keep=expirydate amount) out=expiration;
  by expirydate;
run;

data want;
  set expiration (rename=(expirydate=startdate) in=inexp) have (in=inhave);
  by startdate;
  cum_amount + ifn(inexp,-1*amount,amount);
  if inhave;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2641 views
  • 2 likes
  • 5 in conversation