Hello, I would like my data to look something like this:
Area | Value | CumValue |
A | 5 | 45 |
A | 5 | 45 |
A | 15 | 45 |
A | 20 | 45 |
B | 10 | 10 |
C | 5 | 5 |
D | 10 | 25 |
D | 5 | 25 |
D | 5 | 25 |
I currently have:
By area;
if first.area then value = 0;
cumvalue + value;
run;
I've seen code that condenses all of them and spits it out so that I would only have 4 rows (A,B,C,D) but I would like to keep each row and have each equal to the last value.
Thanks!!!
If you want to do it in one data step, this will work:
data want; do until(last.area); set have; by area; cumvalue+value; /* this is a SUM statement */ end; do until(last.area); set have; by area; output; end; cumvalue=0; /* cumvalue is retained because of SUM statement */ run;
It seems that you want:
data want;
set have;
by area;
retain cumvalue;
if first.area then cumvalue = 0;
cumvalue + value;
if last.area then output;
run;
Thanks for the reply!!
Two questions -
Why isn't it - - if first.area then cumvalue = 0; (edit - just realized I have it like that in my op : O)
Also, I believe that code removes duplicates and condenses each area into one. I would like each area to remain and have the final cumulative value.
You can interleave dataset HAVE with itself, using a BY AREA statement to force the interleaving. Each AREA is read twice, the first time building the cumvalue, and the second time outputting the cumvalue:
data have;
input area $ value;
datalines;
A 5
A 5
A 15
A 20
B 10
C 5
D 10
D 5
D 5
run;
data want;
set have (in=firstpass) have;
by area;
if first.area then cumvalue=value;
else if firstpass then cumvalue+value;
if firstpass=0;
run;
The easiest way is to summarize the data and merge it back in.
This example adds the average, but you can do the same for the sum.
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
data have; input area $ value; datalines; A 5 A 5 A 15 A 20 B 10 C 5 D 10 D 5 D 5 ; run; proc sql; select *,sum(value) as sum from have group by area; quit;
If you want to do it in one data step, this will work:
data want; do until(last.area); set have; by area; cumvalue+value; /* this is a SUM statement */ end; do until(last.area); set have; by area; output; end; cumvalue=0; /* cumvalue is retained because of SUM statement */ run;
Thanks so much - I was trying to merge it back in, but this works so much better!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.