08-05-2013 02:23 PM
How are you able to sum values cumulatively, so that all previous values are added into the most recent value?
For example, I have the following values:
However, I want the 'value' section to be added cumulatively to get this instead:
08-05-2013 02:58 PM
First you should sort by "Value". Then in a data step do something like:
if first.id then cum_value=0;
08-05-2013 03:33 PM
You should be able to do this by adding one statement:
cum_value + value;
Note that VALUE will not be affected, but CUM_VALUE will be added containing the values you are looking for.
08-06-2013 10:01 AM
Alternatively, you can take a look at lag<n>(var) functions. It allows you to access the nth previous row value of var effectively allowing you to run rolling totals like last 7 days or last month total etc. but also to achieve your desired result without the need of an additionnal column. Syntax is slightly less natural though since you need to do a case for _N_ LE n
This effectively uses your previously rolling total in variable Value (as replaced at each data step iteration) and adds it to the current row "Value" and then replaces it.
Note that the do; end; is not necessary as this is a single line statement but I put it for readability
Also note that lag1(value) is equivalent to lag(value) but again, its best for readability.
*edited after Astounding's comment. Doing clever use of lag1 pile effect to achieve the desired result. I strongly recommend against it though honestly as it's easy to get lost in piles when there are easier ways around like creating a new variable and dropping the old one.
08-06-2013 11:33 AM
Unfortunately, the LAG function is much more complex than that. It does not retrieve the VALUE from the previous observation. Rather, it retrieves VALUE from the last time that the LAG function executed. This will lead to complications, such as:
If you need to reuse the variable name VALUE, it would be simpler to go through a combination of:
Beware of LAG! It can be the right tool for some jobs, but it is extremely tricky.
08-06-2013 12:06 PM
I stand corrected. I've edited my above example although I recommend not using it. It is a not so trivial way around "pile" effect of lag function. It works though!
Here's the approach I would probably use similar to others above although saving the conditional if statement in favor of the coalesce function.