Hello, I would like my data to look something like this:
Area | Value | CumValue |
A | 5 | 45 |
A | 5 | 45 |
A | 15 | 45 |
A | 20 | 45 |
B | 10 | 10 |
C | 5 | 5 |
D | 10 | 25 |
D | 5 | 25 |
D | 5 | 25 |
I currently have:
By area;
if first.area then value = 0;
cumvalue + value;
run;
I've seen code that condenses all of them and spits it out so that I would only have 4 rows (A,B,C,D) but I would like to keep each row and have each equal to the last value.
Thanks!!!
If you want to do it in one data step, this will work:
data want; do until(last.area); set have; by area; cumvalue+value; /* this is a SUM statement */ end; do until(last.area); set have; by area; output; end; cumvalue=0; /* cumvalue is retained because of SUM statement */ run;
It seems that you want:
data want;
set have;
by area;
retain cumvalue;
if first.area then cumvalue = 0;
cumvalue + value;
if last.area then output;
run;
Thanks for the reply!!
Two questions -
Why isn't it - - if first.area then cumvalue = 0; (edit - just realized I have it like that in my op : O)
Also, I believe that code removes duplicates and condenses each area into one. I would like each area to remain and have the final cumulative value.
You can interleave dataset HAVE with itself, using a BY AREA statement to force the interleaving. Each AREA is read twice, the first time building the cumvalue, and the second time outputting the cumvalue:
data have;
input area $ value;
datalines;
A 5
A 5
A 15
A 20
B 10
C 5
D 10
D 5
D 5
run;
data want;
set have (in=firstpass) have;
by area;
if first.area then cumvalue=value;
else if firstpass then cumvalue+value;
if firstpass=0;
run;
The easiest way is to summarize the data and merge it back in.
This example adds the average, but you can do the same for the sum.
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
data have; input area $ value; datalines; A 5 A 5 A 15 A 20 B 10 C 5 D 10 D 5 D 5 ; run; proc sql; select *,sum(value) as sum from have group by area; quit;
If you want to do it in one data step, this will work:
data want; do until(last.area); set have; by area; cumvalue+value; /* this is a SUM statement */ end; do until(last.area); set have; by area; output; end; cumvalue=0; /* cumvalue is retained because of SUM statement */ run;
Thanks so much - I was trying to merge it back in, but this works so much better!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.