DATA Step, Macro, Functions and more

Cumulative Sum

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 8
Accepted Solution

Cumulative Sum

Hello, I would like my data to look something like this:

Area
ValueCumValue
A545
A545
A1545
A2045
B1010
C55
D1025
D525
D525

 

I currently have:

By area;

if first.area then value = 0;

cumvalue + value;

run;

 

I've seen code that condenses all of them and spits it out so that I would only have 4 rows (A,B,C,D) but I would like to keep each row and have each equal to the last value.

 

Thanks!!!

 


Accepted Solutions
Solution
‎11-03-2017 10:45 AM
PROC Star
Posts: 274

Re: Cumulative Sum

Posted in reply to scouttier

@scouttier:

If you want to do it in one data step, this will work:

data want;
  do until(last.area);
    set have;
    by area;
    cumvalue+value; /* this is a SUM statement */
    end;
  do until(last.area);
    set have;
    by area;
    output;
    end;
  cumvalue=0; /* cumvalue is retained because of SUM statement */
run;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,848

Re: Cumulative Sum

[ Edited ]
Posted in reply to scouttier

It seems that you want:

 

data want;

 set have;

  by area;

      retain cumvalue;

      if first.area then cumvalue = 0;

     cumvalue + value;

     if last.area then output;

run;

Occasional Contributor
Posts: 8

Re: Cumulative Sum

[ Edited ]

Thanks for the reply!!

Two questions -

Why isn't it  - - if first.area then cumvalue = 0; (edit - just realized I have it like that in my op : O)

 

Also, I believe that code removes duplicates and condenses each area into one. I would like each area to remain and have the final cumulative value.

Trusted Advisor
Posts: 1,848

Re: Cumulative Sum

Posted in reply to scouttier
You are right, it should be cumvalue=0.
It was a typo.
Trusted Advisor
Posts: 1,389

Re: Cumulative Sum

Posted in reply to scouttier

You can interleave dataset HAVE with itself, using a BY AREA statement to force the interleaving.  Each AREA is read twice, the first time building the cumvalue, and the second time outputting  the cumvalue:

 

data have;
  input area $ value;
datalines;
A 5  
A 5  
A 15  
A 20  
B 10  
C 5  
D 10  
D 5  
D 5  
run;

data want;
  set have (in=firstpass) have;
  by area;
  if first.area then cumvalue=value;
  else if firstpass then cumvalue+value;
  if firstpass=0;
run;
Super User
Posts: 23,980

Re: Cumulative Sum

Posted in reply to scouttier

The easiest way is to summarize the data and merge it back in. 

This example adds the average, but you can do the same for the sum.

 

https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas

Super User
Posts: 10,848

Re: Cumulative Sum

Posted in reply to scouttier

data have;
  input area $ value;
datalines;
A 5  
A 5  
A 15  
A 20  
B 10  
C 5  
D 10  
D 5  
D 5  
;
run;

proc sql;
select *,sum(value) as sum
 from have
  group by area;
quit;

Solution
‎11-03-2017 10:45 AM
PROC Star
Posts: 274

Re: Cumulative Sum

Posted in reply to scouttier

@scouttier:

If you want to do it in one data step, this will work:

data want;
  do until(last.area);
    set have;
    by area;
    cumvalue+value; /* this is a SUM statement */
    end;
  do until(last.area);
    set have;
    by area;
    output;
    end;
  cumvalue=0; /* cumvalue is retained because of SUM statement */
run;
Occasional Contributor
Posts: 8

Re: Cumulative Sum

Thanks so much - I was trying to merge it back in, but this works so much better!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 870 views
  • 1 like
  • 6 in conversation